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MODELING THE METALLICITY DISTRIBUTION OF GLOBULAR CLUSTERS 
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ABSTRACT 

Observed metallicities of globular clusters reflect physical conditions in the interstellar medium of their high- 
redshift host galaxies. Globular cluster systems in most large galaxies display bimodal color and metallicity 
distributions, which are often interpreted as indicating two distinct modes of cluster formation. The metal-rich 
and metal-poor clusters have systematically different locations and kinematics in their host galaxies. However, 
the red and blue clusters have similar internal properties, such as the masses, sizes, and ages. It is therefore 
interesting to explore whether both metal-rich and metal-poor clusters could form by a common mechanism 
and still be consistent with the bimodal distribution. We present such a model, which prescribes the formation 
of globular clusters semi-analytically using galaxy assembly history from cosmological simulations coupled 
with observed scaling relations for the amount and metallicity of cold gas available for star formation. We 
assume that massive star clusters form only during mergers of massive gas-rich galaxies and tune the model 
parameters to reproduce the observed distribution in the Galaxy. A wide, but not entire, range of model real- 
izations produces metallicity distributions consistent with the data. We find that early mergers of smaller hosts 
create exclusively blue clusters, whereas subsequent mergers of more massive galaxies create both red and blue 
clusters. Thus bimodality arises naturally as the result of a small number of late massive merger events. This 
conclusion is not significantly affected by the large uncertainties in our knowledge of the stellar mass and cold 
gas mass in high-redshift galaxies. The fraction of galactic stellar mass locked in globular clusters declines 
from over 10% at z > 3 to 0.1% at present. 

Subject headings: galaxies: formation — galaxies: star clusters — globular clusters: general 



1. INTRODUCTION 

A self-consistent description of the formation of globular 
clusters remains a challenge to theorists. A particularly puz- 
zling observation is the apparent bimodality, or even multi- 
modality, of the color distribution of globular cluster systems 
in galaxies ranging from dwarf dis ks to giant ellipticals (re- 
viewed by Brodie & Straderl 12006). This color bimodality 
likely translates into a bimodal distribution of the abundances 
of heavy elements such as iron. We know this to be the case in 
the Galaxy as well as in M31, where relatively accurate spec- 
tral measurements exist for a large fraction of the clusters. In 
this paper we will interchangeably refer to metal -poor clusters 
as blue clusters, and to metal-rich clusters as red clusters. 

Bimodality in the globular cluster metallicit y distribution of 
lumino us elliptical galaxies was proposed by Zepf & Ashman 
( 19931) . following a theoretical model of Ashman & Zepf 
(1992). The concept of cluster bimodality became univer- 
sally accepted because the two populations also differ in other 
observed characteristics. The system of red clusters has a 
significant rotation velocity similar to the disk stars whereas 
blue clusters have little rotational support, in the three disk 
galax ies observed in detail: Milky Way, M31, and M33 (Zinn 
1985). In elliptical galaxies, blue clusters have a higher ve- 
locity dispersion than red clusters, both due to lack of ro- 
tation and more extended spatial distribution. Red clusters 
are usually more spatia lly concentrated than blue clusters 
(Brodie & Strader 2006). All of these differences, however, 
are in external properties (location and kinematics), which re- 
flect where the clusters formed, but not how. The internal 
properties of the red and blue clusters are similar: masses, 
sizes, and ages, with only slight differences. Even the metal- 
licities themselves differ typically by a factor of 10 between 



the two modes, not enough to affect the dynamics of molecu- 
lar clouds from which these clusters formed. Could it be then 
that both red and blue clusters form in a similar way on small 
scales, such as in giant molecular clouds, while the differ- 
ences in their metallicity and spatial distribution reflect when 
and where such clouds assemble? 

All scenarios proposed in the literature assumed differ- 
ent formation mechanisms for the red and blue clusters, 
and most scenarios envisioned the stellar population of 
one mode to be tightly linked to that of the host galaxy 
(e.g.JForbes et alJIl 997t iCote et al]fl998T: iStrader et al.1120051: 
Griffe rTet alj 1201 UK The other mode is assumed to have 
formed differently, in some unspecified "primordial" way. 
This assumption only pushed th e problem back in t ime but 
it did not solve it. For example, iBeasley et alJ ((2002) used a 
semi-analytical model of galaxy formation to study bimodal- 
ity in luminous elliptical galaxies and needed two separate 
prescriptions for the blue and red clusters. In their model, 
red clusters formed in gas-rich mergers with a fixed efficiency 
of 0.007 relative to field stars, while blue clusters formed 
in quiescent disks with a different efficiency of 0.002. The 
formation of b lue clusters also h a d to be artif i cially trun- 
cated at z = 5 IStrader etalj ( 12005b , iRhode et al.l d2005l) . and 
GriffenetaUOOToI suggested that the blue clusters could in- 
stead have formed in very small halos at z > 10, before cos- 
mic reionization removed cold gas from such halos. This 
scenario requires high efficiency of cluster formation in the 
small halos and also places stringent constraints on the age 
spread of blue clusters to be less than 0.5 Gyr. Unfortunately, 
even the mo st recent measurements of relative cluster ages in 
the Galaxy dDe Angeli et alj|2005t iMarfn-Franch et alj|2009t 
iDotter et alj|2010h cannot detect age differences smaller than 
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9%, or about 1 Gyr, and therefore cannot su pport or falsify 
the reionization scenario. iDotter et"ail |2010) also show that 
the red clusters have larger dispersion of ages (15%, or about 
2 Gyr) and those located outside 15 kpc of the Galactic cen- 
ter tend to show measurably l ower ages, by as much as 50% 
(or 6 Gyr). In addition. IStrader et al.rd2009h find that the red 
clusters in M3 1 have lower mass-to-light ratios than the blue 
clusters, possibly indicating an age variation. 

In this paper we set out to test whether a common 
mechanism could explain the formation of both modes and 
produce an entire metallicity distribution consistent with 
the observations. We begin with a premise of the hi- 
erarchical galaxy formation in a ACDM universe. Hub- 
ble Space Telescope observations have convincingly demon- 
strated one of the likely formation routes for massive star 
clusters today - in t he mergers of g as-ric h galaxies (e.g., 
Holtzman et~aT]ll992t lO'Connell et atlll995t IWhitmore et all 
1999; Zep f et al.l 19991) . We adopt this single formation mech- 
anism for our model and assume that clusters form only dur- 
ing massive gas-rich mergers. We follow the merging process 
of progenitor galaxies in a Galaxy-sized environment using a 
set of cosmological A^-body simulations. We need to specify 
what type and how many clusters will form in each merger 
event. For this purpose, we use observed scaling relations to 
assign each dark matter halo a certain amount of cold gas that 
will be available for star formation throughout cosmic time 
and an average metallicity of that gas. In order to keep the 
model transparent, we choose as simple a parametrization of 
the cold gas mass as possible. Finally, we make the simplest 
assumption that the mass of all globular clusters formed in the 
merger is linearly proportional to the mass of this cold gas. 

Although such model appears extremely simplistic, we 
have some confidence that it ma y capture main elements of 
the formation of massive clusters. [Rravtsov & Gnediiil (2005) 
used a cosmological hydrodynamic simulation of the Galactic 
environment at high redshifts z > 3 and found dense, massive 
gas clouds within the protogalactic disks. If the high-density 
regions of these clouds formed star clusters, the resulting dis- 
tributions of cluster mass, size, and metallicity are consistent 
with those of the Galactic metal-poor clusters. In that model 
the total mass of clusters formed in each disk was roughly 
proportional to the available gas mass, Mqq oc M g , just as we 
assume here. 

We tune the parameters of our semi-analytical model to re- 
produce the metallicity distributi on of the Galactic globular 
clusters, as compiled by iHarrisi (11996). This distribution is 
dominated by the metal-poor clusters but is also significantly 
bimodal. We attempt to construct a model without explic- 
itly differentiating the two modes and test if bimodality could 
arise naturally in the hierarchical framework. 

We adopt a working definition of red clusters as having 
[Fe/H] > -1 and blue clusters as having [Fe/H] < -1. This 
definition should also roughly apply to extragalactic globu- 
lar cluster systems. We use the concordance cosmology with 
fi = 0.3, n A = 0.7, h = 0.7. 

2. PRESCRIPTION FOR GLOBULAR CLUSTER 
FORMATION 

2.1. Cold Gas Fraction 

We follow the merging process of protogalactic dark matter 
halos using cosmological jV-bod y simulations o f three Milky 
Way-sized systems described in iRravtsov et alj (120041) . The 
simulations were run with the Adaptive Refinement Tree code 



dKravtsov et al.lfT997h in a 25 h~ l Mpc box. Specifically, we 
use merger trees for three large host halos and their corre- 
sponding subhalo populations. The three host halos contain 
~ 10 6 dark matter particles and have virial masses ~ 10 12 M Q 
at z = 0. Two halos are neighbors, located at 600 kpc from 
each other. The configuration of this pair resembles that of 
the Local Group. The third halo is isolated and is located 
2 Mpc away from the pair. All three systems experience no 
major mergers at z < 1 and thus could host a disk galaxy like 
the Milky Way. 

In addition to the host halos, the simulation volume con- 
tains a large number of dwarf halos that begin as isolated sys- 
tems and then at some point accrete onto the host halo. Some 
of these satellites survive as self-gravitating systems until the 
present, while the rest are completely disrupted by the tidal 
forces. We allow both the host and the satellite systems to 
form globular clusters in our model. 

We adopt a simp le hypothesis, motivated b y the hydrody- 
namic simulation of lKravtsov & Gnedin (2005), that the mass 
in globular clusters, Mqq, that forms in a given protogalactic 
system is directly proportional to the mass of cold gas in the 
system, M g . We define the corresponding mass fraction, f g , 
of cold gas that will be available for star cluster formation in 
a halo of mass M/, as 



where fb ~ 0.17 is the universal baryon fraction 
( Ko matsu et al.ll2010l) . 

The gas fraction cannot exceed the total fraction of baryons 
accreted onto the halo, which is limited by external photoheat- 
ing and depends on the cutoff mass M c (z): 

fin= (l+M c (z)/M h f (2) 

We use an updated version of t he cutoff ma ss as a function of 
redshift (originally defined by Gnedin 2000). base d on our fit - 
ting of the r e sults o f recent simul a tions b y Hoeft et al. (2006), 
ICrain et al.l (120071) . iTassis et all (120081) . and lOkamoto et alJ 
(120081) : 

M c (z) ~ 3.6 x ioV a6(1+z) r'M . (3) 

Given the scatter in simulation results and the numerical lim- 
itations of the modeling of gas physics, a reasonable uncer- 
tainty in this mass estimate is of the order 50%. However, 
the resulting cluster mass and metallicity distributions are not 
very sensitive to th e exact form of this equation. Note that 
Orban ^FaT] (120081) provided an earlier revision of the equa- 
tion for M c (z)\ our current form is more accurate. If M c (z) 
falls below the mass of a halo with the virial temperature of 
10 4 K, we set M c (z) equal to that mass: 

M c . min (z) = M 4 = 1 .5 x 10 10 A; ; y 2 /r 1 M Q , (4) 

where A v ; r = 180 is the virial overdensity and H(z) is the Hub- 
ble parameter at redshift z- This criterion ensures that we only 
select halos able to cool efficiently via atomic hydrogen re- 
combination lines. 

Some of the baryons accreted onto a halo may be in a warm 
or hot phase (at T > 10 4 K) unavailable for star formation, 
thus f g < fi„ < 1 . We assume that only the gas in cold phase 
(T <C 10 4 K) is likely to be responsible for star cluster forma- 
tion. The cold gas fraction f g is calculated by combining sev- 
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eral ob served scaling relations. From the results of McGaugh 
( 12005b . the average gas-to-stellar mass ratio in nearby spiral 
and dwarf galaxies can be fitted as 



Mg 



-0.7 
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where M s is a characteristic scale mass, which we found to be 
M s (z = 0) s» 4 x 10 9 M Q . This relation saturates at low stel- 
lar masses, where f g cannot exceed /,-„. At higher redshift 
the only informatio n on the gas co ntent of galaxies comes 
from the study by Erb et al. (2006) of Lyman break galax- 
ies at z = 2. These authors estimated the cold gas mass by 
inverting the Kennicutt-Schmidt law and using the observed 
star formation rates. These estimates are fairly uncertain 
and model-dependent. Within the uncertainties, their results 
can be fitted by the same formula but with a different scale 
mass: M s (z = 2) sa 2 x 1O 1O M . To extend this relation to all 
epochs, we employ a relation that interpolates the two values: 



M s (z) « 10' 



9.6+0.35; 



(6) 



An additional scaling relation is needed to complement 
equation (0 with a prescription for stellar mass as a func- 
tion of halo mass. We compile it by combining the observed 
stellar mass-circular velocity correlation with the theoreti- 
cal circular velocity-halo mass correlation. IWoo et ail (12008b 
found that the stellar mass of the dwarf galaxies in the Lo- 
cal Group correlates with their circular velocities, which are 
taken as the rotation velocity for the irregular galaxies or 
the appropriately scaled velocity dispersion for the spheroidal 
galaxies. In the range 1O 7 M0 < < 10 10 M Q , appropriate 
for the systems that may harbor globular clusters, the cor- 
relation is V c oc MS' 27 * ' 01 . This can be inverted as pa 
1.6 x 10 9 Mq (K/lOOkms -1 )" 

Cosmological TV-body simulations show that dark matter 
halos, both isolated halos and satellites of larger halos, ex- 
hibit a robust correlation betw een their mass and m aximum 
circular velocity (e.g., Fig. 6 of lKravtsov et alj[2004l) : V max pa 
100 (M/,/1.2 x 10" M ) - 3 kms" 1 . This maximum circu- 
lar velocity of dark matter is typically lower than the rota- 
tion velocity of galaxies because of the contribution of stars 
and gas. To connect the two velocities, we apply the cor- 
rection V c = v^Vmax, which reflects the observation that the 
mass in dark matter is approximately equal to the mass in 
stars over the portions of galaxies that contain the majority 
of stars. Then the equations in the last two paragraphs lead to 
M* pa 5.5 x lO lo (M ft /iO 12 M ) u M Q . 

We also need to ex t end t his local relation to other redshifts. 
IConrov & Wechsler] (120091) matched the observed number 
densities of galaxies of given stellar mass with the predicted 
number densities of halos of given mass from z = to z ~ 2, 
averaged over the whole observable universe. They find that 
the stellar fraction /* peaks at masses M h ~ 1O 12 M and de- 
clines both at lower and higher halo masses. The range of 
masses of interest to us is below the peak, where we can ap- 
proximate /* dependen ce on halo mass as a power- law. The 
results from Fig. 2 of IConrov & Wechslerl (120091) are best 
fit by a steeper relation than we derived for the Local Group 
and also show significant variation with redshift at lower halo 
masses ~ 1O H M : M* cx M^ 5 (I + z)~ 2 (there is much less 
variation with time around the peak at 10 12 M , implying only 
a weak evolution in the total stellar density at z < 1 ). We adopt 
the same redshift dependence to our local relation, while using 



the shallower slope derived from I Woo et alj (2008) because it 
deals with a population of halos in the mass range correspond- 
ing to the Milky Way progenitors. The stellar mass fraction 
of isolated halos is thus 
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:0.32 



M„ 



1O 12 M 
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(7) 



This relation steepens at low masses because of two additional 
limits on the gas and stellar fractions, which we impose to 
constrain the range of equations (|5H7| to be physical. 

First, the sum of the gas and stars ("cold baryons") cannot 
exceed the total amount of accreted baryons in a halo: 



f*~^~ fg — fin 



(8) 



At each redshift, there is a transition mass M/,. co id, below 
which /*+/# = fin and above which f*+f g < f n - For masses 
M < M/,.coid (but not too low, see next paragraph), we set 
/g,revised = fin~f*, with /* still given by equation (|7]). We con- 
sider the baryons that are not included in f g or /* to be in the 
warm-hot diffuse phase of the interstellar medium. 

Second, the ratio of stars to cold baryons, /i* = /*/(/* +f g ), 
is not allowed to increase with decreasing halo mass. For mas- 
sive halos (M n > M^ co id) /U* monotonically decreases with 
decreasing mass because of the condition (O. At some in- 
termediate masses M/,. M < M„ < Mucoid., M* continues to de- 
crease but the gas fraction is reduced by the condition ([8). At 
Mh < Mh,^, would reverse this trend and increase with de- 
creasing halo mass because the cold gas is almost completely 
depleted. Such a reversal is unlikely to happen in real galax- 
ies, which would not be able to convert most of their cold gas 
into stars. Therefore, for all masses M„ < M„ ifM we fix /x* to 
be equal to the minimum value reached at M/,. M . This affects 
both /* and f g . 

We expect our stellar mass prescription to apply in the range 
of halo masses from 10 9 to 1O 12 M , at least for the Local 
Group. However, this relation breaks when a halo becomes 
a satellite of a larger system. Satellite halos often have dark 
matter in the outer parts stripped by tidal forces of the host, 
while the stars remain intact in the inner parts. Unless the 
satellite is completely disrupted, we keep its stellar mass fixed 
at the value it had at the time of accretion, even though the 
halo mass may subsequently decrease. 

The simultaneous effects of the above scaling relations are 
difficult to understand as equations. Figure[T]illustrates graph- 
ically the values of the gas and stellar fractions used in our 
model at various cosmic times. At z = the gas fraction 
peaks for halos with Mi~3x 10 10 M . At lower masses 
it is reduced by the amount of accreted baryons (eq. |8), 
while at higher masses it is reduced by the gas-to-stars ra- 
tio (eq. [5]). The stellar fraction follows equation (0 at high 
masses but drops faster at low masses because of the con- 
straint ([8]). For a Galaxy-mass halo, M/, sa 1O 12 M , our model 
givesM* sa 5.5 x 1O 1O M andM g w9x 1O 9 M . These num- 
bers are consistent with the observed amount of the disk and 
bulg e stars and the atomic and molecular gas in the Galaxy, 
from lBinney & Tremaind (120081) . 

At earlier epochs at all masses of interest, the gas fraction 
is higher and the stellar fraction is lower. There is a range 
of halos withM/, > 10 10 M , which have an almost 100% gas 
fraction at redshifts z > 3. Such halos should be most efficient 
at forming massive star clusters. 

We realize that our adopted relations for the evolution of 
the stellar and gas mass are not unique, as we are basing each 
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Figure 1. Gas mass fraction (solid tines) and stellar mass fraction (dotted 
lines) vs. halo mass in our model at redshifts z = 0, 1,2,3,4,5,6. Stellar 
fractions monotonically increase with time while gas fractions decrease with 
time. The kink in the curves is due to our restriction on the maximum stellar 
fraction via eq. HQ. 



fit on two data points. In order to test the sensitivity of our re- 
sults to these assumptions, we con sider alternative functional 
forms for these fits in Section 14.51 In particular, we give the 
stellar fraction a steeper dependence on halo mass and weaker 
dependence on cosmic time: 



/*,ait = 0.32 



M h 



10 12 M 







0.5 



(i+zr 



(9) 



Such a slower evolution of th e stellar mas s is con s istent with 
the ob serva tional studies of | Borch et al . (2006), Be ll et al] 
d2007lh and lDahlen et al.l ( 12007b . The corresponding gas and 
stellar fractions are shown in Figure [2] Note that the amount 
of cold gas available for cluster formation is not strongly af- 
fected by this change (compare Figs.[T]and|2]). 

2.2. Rate of Cluster Formation 

Having fixed the parametrization of the available cold gas, 
we then relate the gas mass of a protogalaxy to the combined 
mass of all globular clusters it can form within ~ 10 8 yr (the 
timescale of the simulation output). We based it on the rate 
derived in lKravtsov & Gnedinl d2005l) : 



Mi 



cc ] 



= 3x10^(1+,,)$^- 



(10) 







An additional factor, 1 + p%, allows us to boost the rate of clus- 
ter formation. Such a boost may be needed because we form 
new clusters only at arbitrarily chosen epochs corresponding 
to the simulation outputs. Unresolved mergers between the 
outputs may require p2 > 0. In our model we find the best 
fit to the Galactic metallicity distribution for pi ~ 3 (see Ta- 
ble!]}. 

Note also that equation ( flOb imposes the minimum mass 
of a halo capable of forming a globular cluster. Based on 
dynamical disruption arguments (Section [3) we track only 




0.01 - 



M h (M ) 

Figure 2. Same as Fig. [T] but for an alternative stellar mass prescription, 
eq. CD. 



clusters more massive than M m \„ = 10 M . Since we al- 
ways have M g < fbMh, in order to form even a single clus- 
ter with the minimum mass, the halo needs to be more mas- 
sive than 10 9 M Q . For gas-rich systems at high redshift, 
M cc ~ 10" 4 Mg/f b ~ lO~ 4 M h . 

Given the combined mass of all clusters to be formed in an 
event, M^c, our procedure for assigning masses to individual 
clusters is as follows. We first draw the most massive cluster, 
which we call the nuclear star cluster, even though we do not 
have or use the information about its actual location within the 
host galaxy and it is not important for our current study. The 
mass assigned to the nuclear cluster, M max , is derived from the 
assumed initial cluster mass function, dN/dM = MqM~ 2 : 



1 = 



dN 
d~M^ 



dM, 



(11) 



which gives M max = Mq. This normalization is constrained by 
the integral cluster mass: 



M cc = 



A/„, 



M mm 



dN M m . dx 
M—rrzdM = M mdx In - 



dM 



Mr 



(12) 



The power-law initial mass function agrees both with the ob- 
servations of young star clusters and the hydrodynamic sim- 
ulations. After the nuclear cluster is drawn, the masses of 
smaller clusters are selected by drawing a random number 
< r < 1 and inverting the cumulative distribution: r = N(< 
M)/N(< M max ): 



M- 



l-r(l-M min /M n 



(13) 



We continue generating clusters until the sum of their masses 
reaches M cc . 

The formation of clusters is triggered by a gas-rich major 
merger of galaxies, which includes mergers of satellite halos 
onto the main halo as well as satellite-satellite mergers. New 
clusters form when the halo mass at z'-th simulation output 
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Table 1 

Fiducial Model Parameters 



Parameter 


Value 


Effect 


Oniet 


0.1 


log-normal dispersion of mass-metallicity relation 


Pi 


3.0 


boost of the rate of cluster formation 


Pi 


0.2 


minimum merger ratio 


P4 


0.04 


minimum cold gas fraction for case-1 formation 


P5 


0.98 


minimum cold gas fraction for case-2 formation 



exceeds the mass at the previous output by a certain factor, 
and at the same time the cold gas fraction exceeds a threshold 
value: 

case-1: M hA > (1 +p 3 ) M/,.,_i and f g > p 4 . (14) 

Also, we require that the maximum circular velocity does not 
decrease in this time step, to ensure that the mass increase was 
real rather than a problem with halo identification. We have 
experimented with a more relaxed criterion for the main halo 
than for satellite halos, with /?3. m ain < /?3,sat, but did not find a 
significantly better fit to the mass or metallicity distributions. 
We therefore keep a single value of for all halos. 

For some model realizations, we consider an optional al- 
ternative channel for cluster formation without a detected 
merger, if the cold gas fraction is very high: 



"2: fg>Ps, 



(15) 



where the threshold p$ is expected to be close to 100%. 
This channel allows continuous cluster formation at high red- 
shift when the galaxies are extremely gas-rich. High-redshift 
galaxies are probably in a continuous state of major and mi- 
nor merging, but because of their lower masses it is more dif- 
ficult to detect such mergers in the simulation. Additional 
motivation for this channel follows from some nearby star- 
burst galaxies that are forming young massive clusters despite 
appearing isolated. Case-2 formation is allowed only for 
isolated halos before they are accreted into larger systems and 
become satellites. The epoch of accretion is defined by the 
last timestep before the orbit of the subhalo falls permanently 
within the virial radius of its host. 

Our model sample combines clusters formed in the main 
halo and in its satellites, either surviving or disrupted. We 
exclude clusters from the satellites that have a galactocentric 
distance at z = greater than 150 kpc, which is the largest dis- 
tance of a Galactic globular cluster. We apply the criteria for 
cluster formation at every timestep of the simulation (every 
~ 10 s yr) for each of the three main halos and their satellite 
populations. The rate of cluster formation per every merger 
event is therefore approximately Mqq /10 8 yr. 

In order to compare the distribution of clusters obtained 
from our analysis to the distribution of Galactic globular clus- 
ters, we normalize the total number of model clusters by the 
ratio of the Galaxy mass to the simulated halo masses at z = 0: 



N, 



normalized ' 



" N model 



M h i+M h2 +M hi 



(16) 



We take M w = 1O 12 M and use M hl = 2.37 x 1O 12 M , 
M h2 = 1.77 x 10 12 M ^, and M h3 = 1.70 x 1O 12 M from 
Krav tsov et all d2004l) . 



2.3. Metallicity 

The iron abundance is assigned to each model cluster ac- 
cording to the estimated average metallicity of its host galaxy. 
The latter we obtain from the mass-metallicity relation for 
dwarf galaxies o f the Local Group at z = as formulated by 
IWoo etal.ld2008h : 



[Fe/H] = -1.8 + 0.41og 



10 6 M 



(17) 



In fact, the same fit is valid fo r the smallest, ultrafaint dwarfs 
studied by Kirb v et all (12008). Thus we apply this relation to 
all protogalactic systems in our simulation volume. 

We also include the evolution of this relation with cos- 
mic time, based on the available observations of Lyman-break 
galaxies at z ~ 2 (lErb et al . 2006), Gemini Deep Deep Survey 
galaxies at z ~ 1 (Savagl io et al.ll2005l) . and cosmological hy- 
drodynamic simulations that p rovide the averag e metallicity 
of galaxies dBrooks et al.ll2007HDave et al.ll2007l) : 



[Fe/H](f)«[Fe/H] -0.03 



to-t 
10 9 yr 



(18) 



While this temporal evolution is probably real, it can change 
the metallicity for the same stellar mass by at most 0.36 dex 
in 12 Gyr. This amount is smaller than the 0.4 dex change 
of [Fe/H] due to the stellar mass variation by a factor of 10. 
In our model, globular clusters form in protogalaxies with a 
range of stellar masses of several orders of magnitude (see 
Fig.|9]below). 

The observed mass-metallicity relation for a large sample 
of galaxies observed by the SPSS has an i ntrinsic scatter of 
at least 0.1 dex (e.g., Tremonti et al. 2004). We account for 
it, as well as for possible observational errors, by adding a 
Gaussian scatter to our calculated [Fe/H] abundances with a 
standard deviation of er met = 0.1 dex. The exact value of this 
dispersion is not important and can go up to 0.2 dex without 
affecting the results significantly. 

Using e quat ions ( fTTl ) and ( fT8b along with the procedures 
of Section 12.21 we can generate a population of star clusters 
with the corresponding masses and metallicities. The model 
contains two random factors: the scatter of metallicity and 
the individual cluster masses assigned via equation dl3D . We 
sample these random factors by creating 1 1 realizations of the 
model with different random seeds. Each realization com- 
bines clusters in all three main halos. Taking into account 
that the halos are about twice as massive as the Milky Way, 
the expected number of clusters in each model realization is 
~ 150(the observed number) x (2.37+ 1.77+ 1.7) « 870. The 
total set of all 1 1 realizations includes ~ 9500 clusters. For 
the purpose of conducting statistical tests on the distributions 
of cluster mass and metallicity, we consider each realization 
separately and then take the median value of the calculated 
statistic. 

For convenience, we provide a list of the most important 
equations we used in the model in table |2] 

3. DYNAMICAL DISRUPTION 

Star clusters are prone to gradual loss of stars, and in some 
cases, total disruption by internal and external processes. It 
is expected that the mass function of globular clusters has 
evolved through cosmic time, from an initial (probably, a 
power law) distribution to the approximately log-normal dis- 
tribution that is observed today. Since the main focus of this 
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Table 2 

Summary of Model Equations 



Equation Section Description 



2.1 
2.1 
2.1 
2.1 
2.2 
2.3 

ET 



fraction of baryons accreted onto a halo 

cutoff mass for baryon accretion 

cold gas mass relative to stellar mass 

stellar mass relative to halo mass 

mass in globular clusters relative to gas mass 

stellar mass-metallicity relation for halos 

evolution of cluster mass 



paper is on the observable properties of the Galactic popula- 
tion, we evolve all of our model clusters dynamically from 
their time of formation until the present epoch. We adopt the 
evaporation via two-body relaxation and stellar evolution as 
the mechanisms for mass loss. Tidal shocks are ignored for 
simplicity. Cluster mass changes because of the decrease of 
the number of stars, N*(t), by evaporation and the decrease of 
the average stellar mass, m(t ), by stellar evolution: 



1 dM 1 dN* 1 dm , s , s m(0) 

— -J- = — —T- + = -77 = -^ev(M) - I/ se (f ) . 

M dt N* dt m dt m 



(19) 



We have assumed, as done in the recent literature, that the 
evaporation rate depends only on cluster mass. The time t for 
each cluster is measured from th e moment of its formatio n. 

We adopt the calculation of iPrieto & Gnedin! d2008l) for 
the time-dependent mass-loss rate due to stellar evolution, 
VsJ}) (see me ii" Fig- 7). That calculation uses the re- 
lation between star's initial mass and remnant mass from 
IChernoff & Weinberg! (1 1 9901) and the main-sequence lifetimes 
from lHurlev et al.l (12000). Overtime, ste llar evol u tion r educes 
the cluster mass by up to 40%, for a iKroupal ([2001) IMF. 
This implies that no clusters are disrupted by stellar evolution 
alone, and the net effect is only a shift in the mass distribution 
towards the lower end. 

We now need to derive the evaporation rate, u ev (M), as a 
function only of cluster mass. We beg in by writing down the 
standard approximation ( Spitz erl 1 9871) using the half-mass re- 
laxation time, t r h'. 



& 7.25 iJhG 1 ! 1 In A 



(20) 



where £ e is the fraction of stars that escape per relaxation time, 
/?h is the half-mass radius, and In A is the Coulomb logarithm. 
We take m = O.87M for a Kroupa IMF, and l nA= 12, whic h 
is a common value used for globular clusters (Spitzer 1987). 

We then assume that at the time of formation Rh depends 
only on cluster mass, as oc M s °, and not on the position 
in the host galaxy. As a fiducial model, we use a constant 
density model where i5o = 1/3 ((Kravtsov & Gnedin! 120051 : 
IPrieto & Gn edin 2008). The relation for the initial size is nor- 
malized with respect to the median observed mass of Galactic 
clusters, 2 x 10 5 M©, and their median size of 2.4 pc: 



fl h (0) = 2.4 pc 



M(0) 



2 x 10 5 M, 



So 



(21) 



A similar relation extends to other dynamically hot stellar 
syst ems: nuclear star clusters and ultracompact dwarf galax- 
ies (IKissler-Pat ig et alj 120061) . The mass-size relation may 
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Figure 3. Dynamically evolved clusters at z = in the fiducial model with 
£ c = 0.033, 5 = <5o = 1/3 (solid histogram), compared to the observed distri- 
bution of Galactic globular clusters (dashed histogram). Dotted histogram 
shows the combined initial masses of model clusters formed at all epochs, 
including those that did not survive until the present. In the model we do not 
follow clusters with the initial masses below 10 5 M Q . 



change over the course of the cluster evolution. We consider 
a power-law relation with a potentially different slope, so that 
the half-mass radius responds to changes in the cluster mass 
as 



R h (t) _ (M(t)Y 
R h (0) \M(0)J 



(22) 



Our preferred value is again 5 = 1/3, but we also discuss re- 
sults for other choices of Sq and S. Recent A-b ody models 
of cluster disruption are consistent with S s» 1/3 dTrenti et alj 
120071: iHurlev et aLl20 08). Note that cluster sizes are only used 
as an intermediate step in the derivation of v ew (M) and can be 
subsequently ignored. The evaporation time thus becomes 



10 10 yr 



0.033 



M(0) 



2x 10 5 M^ 



(M(t)Y 
\M(0)J 



(23) 

The fraction £ e is not well constrained. The lower limit 
on is achieved in isolated cluster s, for which £ e = 0.0074 
( Ambartsumian 1938t ISpitzerll 1946T) . Tidally-truncate d clus- 
ters lo se st ars at a faster rate, as firs t calculated by iHenonl 
(1196 11) and lSpitzer & Chevalier! (119731) . Using orb it-averaged 
Fokke r-Planck models of cluster evolution, Gne din et alj 
(1999) found varying between 0.02 and 0.08 depending 
on time and cluster concentration (their Fig. 4 and Table 
2). More re cently, realistic d i rect jV-body models becam e 
possible (e.g., Baumgardt 2001; Baumgardt & Makino 2003). 
These calculations revealed that the gradual escape of stars 
through the tidal boundary, which is not spherical as in the 
Fokker-Planck calculations, breaks the li near scaling of the 
disruption time with the relaxation time. Baumgardt] d2001l) 

suggested that the evapora tion time scales as v~l oc fj 5 / 4 . 
Gieles & Baumgardt d2008l) verified this relation and found 
almost no dependence on the cluster half-mass radius. In- 
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Figure 4. The mass function of clusters in the fiducial model at different 
epochs corresponding to the cosmic times of 1 Gyr (z. ~ 5.7, dotted), 2 Gyr 
(z ~ 3.2, dotted), 5 Gyr (z ~ 1.3, dashed), 9 Gyr (z ~ 0.5, dot-dashed), and 
13.5 Gyr (z = 0, solid). 



stead, they proposed an explicit dependence on the Galacto- 
centric distance Re and velocity Vg, to reflect the strength of 
the local tidal field: i/~ v oc uT 1 = Rg/Vg- This gives oc 
M 3 / 4 w _1 . Their formula is similar to the emp irical estimates 
of the disruption time by lLamers et al.l (120051) : oc M 65 . 

Since the calculation of the local tidal field is currently be- 
yond our simple model, we ignore the dependence on the 
Galactocentr ic distance but argue that w e can incorporate 
the result of lGieles~& Baumgardt (2008) for the disruption 
timescale by using a lower value of 5q = S = 1 /9. With this 
choice of the exponents, our equation ( 1231 gives vzl oc M 2 I\ 



We discuss these alternative mod els in S ection |4.6l 
For consistency with Pr ieto & Gnedinl (120081) . we adopt £ e = 

0.033 for the fiducial model. 
With the above ingredients, we can now compute the cluster 

mass at time t after formation by inverting is ev (M) in equation 

( [19] ) and assuming that most of the stellar evolution mass loss 

happens much faster than the evaporation: 



M(t ) = M(0) 



£ (t')dt' 



1+3(5 



2/(1+35) 



(24) 

where i/ ev , = v ev (M = M(0)). 

The initial mass function of globular clusters is evolved 
from the time of formation until the present epoch and is 
shown in Figure[3]for the fiducial model. The observed mass 
function in the Milky Way is well represented by a log-normal 
distribution. We derive the masses of the Galactic c lusters 
by tak ing their absolute V-band magnitudes from the lHarrisI 
(1996) catalog and assuming a constant mass-to-light ratio 
M/Lv = 3Mo/Lq. The functional form of a Gaussian built 
around logM for the observed sample is given by 
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Figure 5. Final mass of model clusters vs. their initial mass, for a single 
realization of the fiducial model. Clusters are divided into three age groups: 
triangles represent old clusters (age > 10 Gyr), squares represent intermediate 
age clusters (5 Gyr < age < 10 Gyr), and circles represent young clusters 
(age < 5 Gyr). All disrupted clusters are placed at the bottom of the plot, to 
illustrate the range of their initial mass. The birthline of clusters, M = M(0), 
is plotted as a dashed line for reference. 



(25) 



with the mean logM = 5.22 and standard deviation ctm = 0.61, 
in solar masses. The predicted mass function in the fiducial 
model with £ e = 0.033 and S = So = 1/3 is consistent with the 
observations. The Kolmogorov-Smirnov (KS) test probabil- 
ity of the two mass functions being drawn from the same dis- 
tribution is Pks.m = 7.4%. This value is the median of the 
KS probabilities for the 1 1 random realizations of the model. 
The model distribution is also well fit by a Gaussian, with 
logM = 5.14 and a M = 0.65. 

The mean of the model distribution is slightly lower than 
observed, implying that the disruption process needs to be 
stronger to fully reconcile with the data. Clusters that start out 
with low mass but are not disrupted effectively over their life- 
time over-populate the low end of the present-day model mass 
function. Old and intermediate-age clusters that started with 
initial mass 5 < logM < 5.4 and survived until the present era 
appear to be the main cause of this discrepancy. 

Figure [4] illustrates the evolution of the mass function over 
cosmic time as an interplay between the continuous buildup 
of massive clusters (M > 10 5 M Q ) and the dynamical erosion 
of low-mass clusters (M < 1O 5 M ). Since we do not track 
the formation of clusters below M m \ n , the low end of the mass 
function was built by a gradual evaporation of more massive 
clusters. The strongest bout of cluster formation happens be- 
tween a cosmic time of 1 and 5 Gyr, and a peak of the mass 
function forms at M ~ 3 x 10 4 M Q . The peak moves to larger 
masses, ~ 10 5 M Q by t = 9 Gyr, while a power-law tail devel- 
ops at low masses. A significant fraction of low-mass clusters 
is disrupted between 9 Gyr and the present, as few new clus- 
ters are produced. 

The relation between the cluster initial and final masses is 
shown in Figure Old clusters that have undergone sig- 
nificant amounts of dynamical and stellar evolution form a 
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Figure 6. Metallicities of model clusters formed at all epochs that have sur- 
vived dynamical disruption by z = in the fiducial model (solid histogram), 
compared to the observed distribution of Galactic globular clusters (dashed 
histogram). 

tight sequence on this plot. The lower boundary with a 
dense concentration of points corresponds to the expression 
M = 0.63[M(0)-2.6x 10 5 M Q ], which reflects 13Gyr of stel- 
lar and dynamical evolution according to equation (l24l i with 
the fiducial values of the parameters. Thus an old cluster must 
have an initial mass of at least 2.6 x 10 5 M Q to survive dy- 
namical disruption. Clusters in the younger age groups fill the 
space between their birthline and this boundary. The youngest 
clusters have the shallowest slope at low mass, as few of them 
have had enough time to undergo significant disruption. The 
mean final mass for all three age groups is about the same, 
implying that some of the oldest globular clusters could have 
been more massive at the time of their formation than clusters 
th at have formed recen tly in the local universe. 

iFall & Zhangl (12001b suggested that a low-mass end of the 
mass function should approach dN/dM ss const as a result 
of dominant disruption by two-body evaporation. Our mass 
function in the range 3.5 < log(M/M Q ) < 5.0 is consistent 
with a power law log(dN/dlogM) = 0.891ogM - 3.04, or 
dN/dM oc M~°' , in good agreement with the expectation. 

4. RESULTS 

4.1. Exploration of the Parameter Space 

Overall, our model has five adjustable parameters (Table[T|i. 
To explore possible degeneracies among these parameters, 
and to find the parameter set that produces the best-fitting 
metallicity distribution, we set up a grid of models in which 
each of the parameters was varied within a finite range of val- 
ues. The range was taken to be large enough to explore all 
physically relevant values of each parameter. 

The boost for cluster formation, p2, varied from to 5. For 
consistenc y with the rate derived in the hydrodynamic sim- 
ulation of iRravtsov & Gnedin! ( 12005b . we aimed to keep this 
parameter at low values. 

The minimum mass ratio for mergers, pi, varied between 
0. 15 and 0.5. It is consistent with typical major merger criteria 
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Figure 7. Metallicity distribution in the fiducial model, split by the forma- 
tion criterion: major mergers (case-1) and early mergers (case-2). Solid 
histograms show the clusters formed in the main Galactic halo. 



used in the literature (e.g. lBeaslev et alj|2002l use pT, = 0.3). 

The cold gas fraction required for cluster formation during 
a merger, p\, could be relatively low but non-zero, so that we 
considered < p4 < 0.2. This threshold parameter accounts 
for why disk galaxies like the Milky Way are still forming 
stars despite a low gas fraction, while ellipticals are not. 

The gas fraction for case-2, p$, has to be very high 
- above 90%, as our prescription predicts that many ha- 
los have a very high gas fraction at high redshift and could 
over-produce blue globular clusters (as was the case in the 
iBeasley et a02002l model). 

We considered several values for a met but found that a value 
of 0.2 or higher smeared out the peaks in the metallicity dis- 
tribution, while a value of failed to fill the extreme ends of 
the distribution. We therefore include only three values in our 
search, cr met = 0,0.1,0.2. 

We find the best-fit model by searching through the multi- 
parameter space and maximizing the KS probabilities of the 
metallicity distribution, Pks.z, and the mass function, Pksm^ 
being consistent with observations. The likelihood function 
also contains additional factors that force the parameters to- 
wards the values that we consider ideal. We require the model 
to produce the observed number of clusters, N « 150, scaled 
by the host galaxy mass as in equation dl6t . We wish to max- 
imize the fraction of clusters formed in the main disk, /disk, 
to be consistent with the observed spatial distribution (Sec- 
tion |5]l. We penalize the likelihood function for large values 
of p2 and for any young clusters formed after t = 10 Gyr, 
A^afterio- We also wish to minimize the fraction of clusters 
formed through the case-2 channel, f case 2, for simplicity 
of the model. Finally, we want to increase the likelihood of 
the metallicity distribution being bimodal, as characterized by 



the Dip test, P&p, which we discuss later in Section \4~4\ The 
actual likelihood function that we maximize is given by 

log C = log Pks,z + . 3 log P K sm - [(jV-150)/30] 2 +log/ disk 
-0.15p 2 -20AWrio/iV-0.4/ case2 +3 logP dip . (26) 
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Figure 8. Age-metallicity relation in all 1 1 realizations of the fiducial model 
(fa 9500 clusters). The build-up of massive halos drives the steep slope of this 
relation at early epochs. Outer histograms show marginalized distributions on 
linear scale. Notice an order-of-magnitude spread in metallicities of clusters 
forming at a given epoch. 
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Figure 9. Number of clusters in the fiducial model as a function of the envi- 
ronment: redshift of formation (top left panel), present age (top right panel), 
host halo mass at the time of formation (bottom left panel), and host stellar 
mass (bottom right panel). 



The coefficients for each term were adjusted heuristically un- 
til we found that their relative weights matched our expecta- 
tion to select acceptable distributions. The "best-fit" distribu- 
tion that maximizes C is therefore a subjective fiducial model 
that we use to illustrate how the bimodality may arise. We 
then look at how many model realizations are similar to the 
"best-fit" for other possible values of the parameters. 

4.2. Age and Metallicity Distributions 

Figure [6] shows the predicted best-fit metallicity distribu- 
tion of model clusters and the observed distribution of Galac- 
tic globular clusters, both metal-poor and metal-rich. Note 
that we require our model to have the same formation criteria 
for both cluster populations; we do not explicitly differentiate 
between the two modes. The only variable is the gradually 
changing amount of cold gas available for star formation. Yet, 
the model predicts two peaks of the metallicity distribution, 
centered on [Fe/H] = -1.54 and [Fe/H] = -0.58, in remark- 
able agreement with the observations. The standard deviation 
of the red peak is 0.24 dex and of the blue peak is 0.32 dex. 

The probability of KS test of the model and data samples 
being drawn from the same distribution is Pks,z = 80%, that 
is, they are fully consistent with each other. The number 
of surviving clusters is N = 147, also matching the observa- 
tions. Even though our current model is extremely simple, 
this bimodality is reproduced naturally, without explicit as- 
sumptions about truncation of the production of metal-poor 
clusters at some early epoch or about the formation of metal- 
rich clusters in a merger of two spiral galaxies. 

We find that the main halo contributes more significantly to 
the red peak than it does to the blue peak (Figure 0. In par- 
ticular, clusters with the highest [Fe /H] appear to have been 
formed primarily by late merging into the main halo. 

The fraction of clusters formed via case -2 channel is 
/case2 = 22%. These clusters produce a single-peaked distri- 



bution of blue clusters. In contrast, clusters formed in major 
mergers contribute to both red and blue modes, in about equal 
proportions. We return to this point in the discussion of glob- 
ular cluster systems of elliptical galaxies in Section|7] 

Clusters that formed after z = 2 constitute the bulk of the 
red peak and contribute little to the blue peak in the metallic- 
ity distribution (Figure [8}. The strength of this result implies 
that the gas reservoir and the rate of hierarchical merging at 
intermediate redshifts is conducive to the creation of red clus- 
te rs. This result lends itself w ell to the idea that the simulation 
of iKravtsov & Gnedinl (120051) was only able to reproduce the 
metal-poor population of globular clusters because the simu- 
lation was stopped at z ~ 3. 

Our prescription links cluster metallicity to the average 
galaxy metallicity in a one-to-one relation, albeit with random 
scatter. Since the average galaxy metallicity grows monoton- 
ically with time, clusters forming later have on the average 
higher metallicity. The model thus encodes an age-metallicity 
relation, in the sense that metal-rich clusters are younger by 
several Gyr than their metal-poor counterparts. This relation 
is required in the model to reproduce the observed metallicity 
distribution, because very old galaxies cannot produce high 
enough metallicities. However, Figure [8] shows that clusters 
of the same age may differ in metallicity by as much as a fac- 
tor of 10, as they formed in the progenitors of different mass. 

Available observations of the Galactic globular clus- 
ters do not show a clear age-metallicity relation, but in- 
stead indicate an age spread increasing with metallicity 
dDe Angeli et al.l2005tlM arin-Fran ch et al.l2009t iDotter et all 
2010; Forbes & Bridges 2010). Red clusters have younger 
mean age overall and may be as young as t s» 7 Gyr. Our 
model does not appear to be in an obvious conflict with this 
trend. We define cluster age as r = to — tf, where tf is the 
time of formation. We find the mean age of 1 1.7 Gyr for the 
blue population and 6.4 Gyr for the red population, with the 
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Figure 10. Ratios of total cluster mass and number to the galaxy stellar mass, 
summed over all systems that are located within 150 kpc of the center at z - 0. 
Lines represent the average over all three host halos and their corresponding 
subhalo populations. Shading represents the scatter in these ratios, given by 
the lowest and highest values among the three hosts. Massive star cluster 
formation was a much more dominant component of galactic star formation 
at early times than it has been for the last 10 Gyr. 



standard deviation of 1.3 Gyr and 2.7 Gyr, respectively. More 
accurate dating of the Galactic and extragalactic clusters is 
needed to falsify the predicted age-metallicity trend. 

Distributions of the cluster formation time and environment 
in the fiducial model are shown in Figure [9] The age distri- 
bution, which peaks strongly between 1 1 and 1 3 Gyr, demon- 
strates that the majority of our clusters is still very old and 
falls in line with the observed perception of globular clus- 
ters. However, the distribution of formation redshift appears 
remarkably flat in the range 1 < z < 7, emphasizing that the 
clusters were not formed in a single event but rather through 
the continuous process of galaxy formation. Few clusters 
were formed prior to the era of reionization, as sufficiently 
large quantities of gas could not be condensed to meet the 
mass threshold for cluster formation at redshifts z > 9. The 
distributions of the total and stellar mass of the host galaxies 
extend over three orders of magnitude. Their extended high- 
mass tails contribute to the strength of the red peak, as the 
most massive halos would form most metal-rich clusters. 

Globular clusters form much earlier than the majority of 
field stars. Figure[l0]shows the fraction of galaxy stellar mass 
locked in massive star clusters, normalized for convenience 
as 10 3 Mcc/M*. To calculate this ratio, we summed over all 
protogalactic systems that would end up within 150 kpc of the 
galaxy center at z = 0, regardless of their location at earlier 
times. Thus it represents a global cluster formation efficiency 
in a Milky Way-sized environment. Specific realizations of 
the model differ in detail in the three host halos, by as much 
as a factor of 2. This scatter is shown by the shaded region on 
the plot. The globular cluster mass includes their continuous 
formation and the mass loss due to the dynamical evolution. 
A striking prediction of the model is a very high cluster frac- 
tion at early times, near t = 1 Gyr, of M cc /M f w 10-20%. 



Star cluster production may have been a dominant compo- 
nent of galactic star formation at z > 3. By t = 3 Gyr (z ~ 2), 
the cluster fraction drops to only a few percent, as expected 
for a galaxy undergoing active star formation. At the current 
epoch, massive star clusters make up less than 0.1% of the 
stellar mass. The predicted ratio is progressively more un- 
certain at higher redshift because it relies on our extrapolated 
prescription for the galactic stellar mass. The low-redshift 
prediction should be robust. We also show a variant of the 
specific frequency parameter related to the number of clusters , 
T = A7(M*/10 9 M Q ), introduced bv lZepf & Ashmanl (119931) . 
It shows a similar decline with time, reaching T w 2 at the 
present. 

These global cluster formation ef ficiencies agree with many 
observations across galaxy types. iRhode et alj d2005l) find 
T ~ 1 for both red and blue clusters in the field and group spi- 
ral galaxies. This parame ter increase s with the galaxy mass. 
In the Virgo cluster. lPeng et al.1 (120081) find T ~ 5 for galaxies 
in the mass range appropriate for the Milky Way. McLaughlin 
( 1999) estimated the cluster mass fraction in both spiral and 
elliptical galaxies to be M GC /(M*+M g ) « 0.0026 ± 0.0005. 
This is larger than what we find by a factor of several, but 
we count in M» all stars out to 150 kpc, which includes some 
satellite galaxies as well as the host. Therefore, both predicted 
cluster efficiencies at z = are reasonable. Their rise at high 
redshift is an interesting prediction of the model. 

The model also shows that the globular cluster system over- 
all is more metal-poor than the stars in disrupted satellites, 
which are expected to form a stellar spheroid of the Galaxy. 
We calculated the mass-weighted metallicity of stars formed 
in the di srup ted satellites of all three main halos (using eqs. 
Ul E H3 n~8b . This calculation bears all the uncertainty of our 
extrapolated time evolution of the stellar fraction and mass- 
metallicity relation, but nevertheless provides a useful esti- 
mate. We find the tail of halo star metallicities as low as the 
most metal-poor globular clusters, but the overall stellar dis- 
tribution peaks around [Fe/H] w — 0.3. A ver y similar situa - 
tion is observed in NGC 5 128 and discussed by H arris! (I2010I) . 
In our model, majority of globular clusters form before the 
bulk of field stars and therefore acquire lower metallicities. 
For comparison, the metallicity of stars in surviving satellite 
galaxies peaks around [Fe/H] m -0.8 and forms an interme- 
diate population between the clusters and the field. 

Despite our attempts to incorporate it as a major penalty in 
the likelihood statistic, we were unable to completely elimi- 
nate the phenomenon of young massive star clusters. Inter- 
estingly, these clusters did not originate in the main galac- 
tic disks. All clusters younger than 5 Gyr formed in satel- 
lite halos in the mass range ~ 10 10 — 10 Mq, at distances 
40-100 kpc from the center. Although the proper sample 
of the Galactic globular clusters does not contain any young 
clusters, there are several young massive clusters in M31 
whose ages were confirmed both from the v isual and UV 
colors dFusi Pecci et all 120051: [Rev et all 120071) and from the 
integrated-light spectroscopy dPuzia et al.ll2.005h . The actual 
analogs of young model clusters may be found in the LMC, 
which hosts globular clusters with a wide range of ages and 
continues to form clusters now. There may even exist young 
star clusters with masses ~ 10 5 M Q in the Galactic disk, hid- 
den behind tens of visual magnitudes of extinction but reveal- 
ing them selves through free-free e mission of their ionization 
bubbles dMurray & Rahmanl l2010). Massive star cluster for- 
mation at late times thus paints a picture consistent with the 



Metallicity Distribution of Globular Clusters 



11 



40 



30 



Z 20 



10 



5[7 



model 
observed 



-2 -1 

[Fe/H] 

Figure 11. Metallicity distribution at z = in the model without case-2 for- 
mation (solid histogram), compared to the observed distribution of Galactic 
globular clusters (dashed histogram). 
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Figure 12. Contour plot of the KS probability for the metallicity distributions 
in the plane of parameters P2~Pi- Contour labels are the actual probability 
values, Pks.z- This plot shows that KS test alone cannot rule out any region 
of the parameter space from being statistically consistent with the data. 



idea that today's super star clusters are destined to become 
ob servationally equ i valent to g lobular clusters, as envi sioned 
bv lAshman &Zepl (fl992h and lHarris & Pudritzl dl994h . 

A separate criterion for the formation of clusters in ex- 
tremely gas-rich systems (case-2) is not necessary for 
achieving a good fit to the observed metallicity distribution. 
Though we feel that the inclusion of case-2 formation chan- 
nel in the model is both useful and physically motivated, it 
takes away from the elegance of using only resolved merg- 
ers as a lone formation mechanism. It turns out that the main 
benefit of allowing clusters to form via case-2 is seen in 
the mass function of surviving clusters. The high-mass end of 
the mass function matches the observations better if massive 
halos (primarily the main halo) are allowed to form as many 
clusters as possible at early times. 

We searched the model grid without case-2, by setting 
PS = 1, and found an almost equally good metallicity distribu- 
tion as in the fiducial model. Figure Q~T] shows that this distri- 
bution also appears bimodal and completely consistent with 
the data. The KS probability is Pks.z = 92%. In fact, even the 
mass function is only marginally less consistent, Pksm = 2.0% 
vs. 7.4% in the fiducial model. The parameters used to obtain 
this distribution were: p2 = 2.85, pi = 0.16, p\ = 0.04, p$ = 1, 
Cmet = 0-1. 

4.3. Sensitivity to Model Parameters 

The fiducial distribution discussed above is not unique 
among our results in its ability to match the observations. Sig- 
nificant degeneracy exists among combinations of the model 
parameters that produce metallicity distributions consistent 
with the Galactic sample. Many models within the grid have 
sufficiently high KS probabilities. In this section we explore 
which regions of the parameter space produce models similar 
to our best fit. 

First, let us motivate the use of the likelihood function given 
by equation (|2o*T i as opposed to using a standard statistical test 



to select the best fit. In the early stages of development of 
our model, we relied on KS test alone to help us understand 
the range of parameters that produce metallicity distributions 
that match the observations. However, once the model was 
completed, it became apparent that KS test alone was not 
powerful enough for analysis of the results. This is clearly 
demonstrated in Figure [T2J which shows the value of the KS 
probability Pks.z as a function of pi and pi across their re- 
spective ranges in the grid. Each point represents the maxi- 
mum possible Pks.z for the given values of p2 and p^ with the 
other parameters free to vary within the grid. This is done to 
best represent the full extent of the 5 -dimensional parameter 
space within a 2-dimensional slice. Statistically, any distribu- 
tion with Pks,z > 10% cannot be ruled out with confidence, 
implying that almost the entire range of our parameters can 
produce statistically consistent distributions! In addition, al- 
though some regions of the parameter space have higher val- 
ues of Pks,z than others, there is no clear pattern in the con- 
tours to help us understand the required physics of star cluster 
formation within our semi-analytical recipe. 

In comparison, Figure [13] shows contours of the value of 
the likelihood function from equation d26l l. using the same 
scheme described above to maximize the value at each point. 
The shape of these contours demonstrates that pi and p^, are 
degenerate in their ability to produce good distributions. The 
degeneracy can most easily be understood by noting that these 
parameters directly affect the total number of clusters: p2 con- 
trols the cluster formation rate per merger, while p^ selects el- 
igible mergers. It is therefore expected that the contours show 
a correlation at high levels of the likelihood function, as the 
statistic depends sensitively on the total number of clusters. 

Figure [T4l shows the same type of contours as Fig. [13] but 
only for distributions with ps = 1. Disallowing the case-2 
channel reduces the range of the parameter space where good 
distributions are found. In particular, compared to the previ- 
ous plot, Fig.fPfllacks any viable models with pi > 0.3. Given 
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Figure 13. Contour plot of the likelihood statistic C in the plane of parame- 
ters P2~Pi - Contour labels show percentages of the maximum. The highest- 
value region is a degeneracy along the line p2 = 19p3 — 0.91. 
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Figure 14. Same as Figure [TSI but for the models without case-2. The 
highest- value region is a degeneracy along the line p2 = 24p^ — 1 . 1 . 

the tight and steep correlation in this plot, it is likely that larger 
values of p^ would require very high p2 > 5, which may vio- 
late current observational constraints on the cluster formation 
efficiency. However, Fig. [TT1 demonstrates that a good model 
can still be found with reasonably small values of pi and p^,, 
without the case-2 channel. 

To understand the sensitivity of the likelihood function to 
individual parameters, we also considered one-dimensional 
slices of the parameter space around the fiducial model, this 
time allowing for only one parameter to vary at a time. Fig- 
ure [15] illustrates how the sharp peaks of C allow us to se- 
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Figure IS. Marginalized single-parameter likelihood distributions around 
the fiducial model, £/£ max (solid lines). Dashed lines show the metallic- 
ity probability PksZ normalized to the fiducial model value. Compared with 
Pks.z alone, the likelihood function C significantly tightens the constraints 
on the best values of the parameters. Filled circles show the fiducial model. 



lect the best model more accurately than on the basis of Pks.z 
alone. Particularly as a function of p2 and pi, Pks.z varies 
slowly over the entire range of the grid. On the other hand, p\ 
and ps must stay within a small range of their fiducial values 
in order to achieve acceptable values of either Pks.z or C. 

Figure [16] shows variation of the metallicity distribution 
when individual parameters deviate from their fiducial val- 
ues. Each parameter can change the shape of the metallicity 
function and the number of clusters. The effects of varying p2 
and p^ are almost opposite, reflecting the degeneracy in the 
likelihood contours. In particular, smaller pj accommodates 
more minor mergers, which allow massive hosts form more 
metal-rich clusters as well as some metal-poor clusters. De- 
creasing ps allows more clusters form through the case-2 
channel; most of such clusters are metal-poor. The major role 
of p4 appears to govern the extent of the most metal-rich clus- 
ters - lower threshold gas fraction allows clusters to form in 
the later, more enriched environments of massive hosts. 

Figure [17] illustrates the response of the metallicity distri- 
bution to simultaneous variations of model parameters. First, 
we plot two distributions where we changed p^ to 0.15 and 
0.3 while keeping the other parameters fixed. The width of 
the metal-poor peak broadens as pi is lowered, indicating that 
a wider range of halos in the early universe were able to pro- 
duce clusters. Raising pi has the opposite effect. Note that 
the locations of the two peaks are remarkably robust to these 
changes. Staying at p^ = 0.15, we set ps = 1 to eliminate the 
case-2 channel and set p\ = to allow even gas-poor mas- 
sive halos at low redshift to form clusters. The result (long- 
dashed line) is a distribution with a much broader metal-rich 
peak, which extends well past the maximum metallicity of the 
fiducial model. The dot-dashed line represents a correspond- 
ing change to p4 = 0.08 and p$ = 0.96 for the pj, = 0.3 model. 
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Figure 16. The effects of varying individual model parameters on the metal- 
licity distribution. In each panel a single parameter is increased (dotted line) 
and decreased (dashed line) relative to the fiducial model (solid line). The 
parameter values are indicated inside the panels. 



In this case, the metal-rich peak is severely depleted and re- 
mains only as an extended tail of a single-peaked, metal-poor 
distribution dominated by case-2 clusters. These distribu- 
tions are just some of the realizations of our model that were 
rejected due to their low values of C. All of them have fea- 
tures that conflict with the observed metallicity distribution in 
the Galaxy. 

4.4. Origin of the Metallicity Bimodality 

The KS statistic measures the overall consistency of the 
model and observed metallicity distributions, but not specifi- 
cally bimodality or multimodality within the distributions. In 
order to address the particular issue of modality, we employ 
two additional statistical tests, described in Appendix. 

The Gaussian Mixture Modeling test indicates that the fidu- 
cial distribution is bimodal at a high level of significance (bet- 
ter than 0.1%). The peak metallicities of both modes and their 
widths are close to the observed values and agree with them 
within the errors. Both samples easily appear bimodal to the 
eye because the modes are well separated, with the dimen- 
sionless peak separation ratio D > 3 (see eq. [A3). However, 
as we discuss in Appendix, the GMM test is sensitive to the 
assumption of Gaussian modes. It may indicate highly statis- 
tically significant split into two modes when the distribution is 
truly unimodal but skewed. For faster and more robust model 
selection we consider another test of multimodality. 

The Dip test compares the cumulative input distribution 
with the best-fitting unimodal distribution. The maximum dis- 
tance between the two corresponds to a dip in the differential 
distribution. The Dip test for the observed Galactic clusters 
indicates that the distribution is 90% likely to not be uni- 
modal. When applied to our fiducial model, the Dip test im- 
plies it is 99% likely to not be unimodal. However, there is a 
caveat that the probability of the Dip test depends on the num- 
ber of objects in the sample, similarly to KS test. The higher 
significance of the model result does not mean that the model 




Figure 17. The effects of simultaneous variation of several model param- 
eters. The fiducial model (solid line) is plotted alongside four distributions 
that illustrate other outcomes of our model. Short-dashed and dotted lines 
correspond to respectively lowering (to 0.15) and raising (to 0.3) away 
from its fiducial value (0.2). Long-dashed line corresponds to the model with 
P3 = 0.15, pn = 0, ps = 1. Dot-dashed line corresponds to the model with 
p 3 =0.3, P4 = 0.08, p 5 = 0.96. 



is actually more bimodal than the data, because we used all 
1 1 random realizations of the model as a combined sample to 
evaluate the Dip test. While this is not a fair comparison to the 
data, it allows us to differentiate efficiently among alternative 
models. 

We ran the Dip test for all models on the grid in a man- 
ner similar to the likelihood statistic. The most interesting 
result of the Dip statistic comes from one-dimensional slices 
of the parameter space. Considering only models with the 
normalized number of clusters in the range 140 < N < 160, 
we binned the distributions according to the values of the four 
parameters and found the median and quartiles of Pdip in each 
bin. Figure [18] shows several trends, (i) Distributions with 
low formation rate pi are unlikely to be bimodal. The 75th 
percentile of Pdip increases systematically with p2 in the range 
2 < P2 < 3, but plateaus for p2 > 3. (ii) The most bimodal dis- 
tributions require p$ to be small enough to allow for merger 
ratios 1:5 to trigger cluster formation. Between p$ = 0.2 and 
Pi = 0.5, the lower pi the better. However, mass ratios lower 
than 1:6 may dilute bimodality. (iii) The gas fraction thresh- 
old p4 should be under 10% for ideal bimodality, to include 
mergers of massive galaxies, (iv) The fraction ps has to be 
close to 1, implying that case-2 negatively affects bimodal- 
ity. A conclusion from this plot is that bimodality appears in 
a significant number of model realizations, for a wide range 
of parameters. At the same time, a similarly large number of 
realizations are unimodal. 

The metallicity distribution is bimodal if metal-rich clus- 
ters constitute a significant subset of all clusters. Thus, the 
fraction of red clusters, / re( j, is a simple proxy for bimodality. 
Indeed, we find a strong correlation between / re d and Pdip- The 
red fraction follows similar, but weaker, trends with model pa- 
rameters to those shown in Fig. [18] The median red fraction 
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ters split by the range of merger mass ratios of their formation event: 0.2 < 
Figure 18. Median values of the Dip probability (triangles) among distribu- AM k /M h < 0.3 (dots), 0.3 < AM h /M h < 0.5 (dashes), and AM h /M h > 0.5 

tions with 140 < N < 160, binned according to each parameter. Dashed lines (soM Une) xhe similarity of all three distributions implies that metallicity 

extend to the 75% quartiles of P dip . The fiducial model is shown by a red dot. bimodality is not caused by mergers with any particular range of mass ratios. 

The observed distribution is plotted in dot-dashes for comparison. 



correlates most strongly with p$, increasing from / le d = 16% 
for ps = 0.96 to / le d = 32% for p$= 1. The red peak is signifi- 
cantly stronger without case-2 clusters. 

We note that the Dip test, unlike KS test, does not depend 
on comparing the model distribution to the Galactic sample. 
Therefore, the trends for bimodality derived from the Dip test 
should apply to other globular cluster systems. We anticipate 
that bimodality would likely arise if we applied our model 
with the parametric constraints stated above to any cosmolog- 
ical A^-body simulation that follows the mass assembly history 
of a large galaxy. Further discussion of applying our model in 
different galactic environments follows in Section[7] 

In order to investigate the underlying cause of bimodal- 
ity, we examined various properties of merger events. A 
merger event is defined as any time in a halo's track when 
it meets the criteria for case-1 formation. An important 
requirement here is the minimum mass of cold gas needed 
to produce a cluster that would survive dynamical disruption. 
Through equation ( [Tol l, a cluster mass M > 2 x 1O 5 M re- 
quires M g >3x 10 8 Mq. This constraint significantly reduces 
the number of eligible mergers. We considered the distribu- 
tions of halo mass, lookback time, and metallicity (without 
additional dispersion) for all relevant merger events. We find 
that relatively few mergers happen in the space of high metal- 
licity, high mass, and late time. Almost half of the merg- 
ers (44%) take place before r = 12 Gyr, and only 24% of 
the mergers happen in the last 10 Gyr. If we also counted 
the events that led to now-disrupted clusters, these numbers 
would spread even further to 53% and 17%, respectively. 
Nevertheless, the recent mergers stand out for two reasons: 
each such event creates more clusters, and these younger clus- 
ters have better chance of surviving the dynamical disruption 
than the older clusters. Since the number of clusters formed in 
each merger is positively correlated with the galaxy mass, the 
few stochastic super-massive mergers with high metallicity 



are likely to produce a significant number of clusters, which 
would separate the red peak from the blue peak. 

We also considered that cluster bimodality may be linked 
to the mass ratios in the merger events. "Major" and "mi- 
nor" mergers have been proposed to play different roles in 
galaxy formation, so it is conceivable that different types of 
star clusters may be formed depending on the merger ra- 
tio. In Figure [19] we plot cumulative metallicity distribu- 
tions for clusters grouped according to the mass ratios in 
the cluster-forming merger event. Mergers with the closest 
masses, AMh/Mh > 0.5, contribute 48% of case-1 clusters, 
while the lower two mass ranges each contribute equal por- 
tions of the rest. Running KS test on these distributions re- 
vealed that they formally represent statistically different pop- 
ulations. However, there is no clear-cut range of metallicities 
where one type of merging is exclusively producing all of the 
clusters, and the overall shapes of the distributions are simi- 
lar. This uniformity suggests that bimodality is a natural con- 
sequence of hierarchical cluster formation regardless of the 
exact definition of a "major" merger. 

Figure [20] shows how many models from the grid fall 
into particular ranges of the Dip probability and the ratio of 
case-2 clusters to case-1 clusters, N2/N1. The mod- 
els are restricted to have the normalized number of clusters 
140 < Nj +N2 < 160. The region with the highest density of 
models is in the lower-right corner of the plot, correspond- 
ing to high Pdip and low N2/N1. Low values of Pdip are not 
significant since they cannot reject a unimodal distribution. 
Effectively, bimodality requires Nz/N\ < 0.5. At the signifi- 
cance level of Pdip = 90%, corresponding to the observed dis- 
tribution, 38% of the grid models are bimodal if N2/N1 < 0.3. 
This fraction drops to only 15%for0.3 <A^/M < 1, and then 
further to 9% for N2/N1 > 1 . These statistics confirms that bi- 
modal populations appear only when the case-2 channel is 
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Figure 20. Number of models resulting in particular values of the Dip prob- Figure 21. Same as Figure fT5] but for the models with an alternative pre- 

ability and the ratio of case-2 to case-1 clusters, for all realizations of scription forM* as a function ofM/,, given by equation 
the parameter grid with the normalized number of clusters 140 < N < 160. 



a secondary formation mechanism. 

Another part of the explanation of bimodality of the sur- 
viving clusters is due to the dynamical evolution. Most of the 
disrupted clusters were old and blue. If we add these disrupted 
clusters to the metallicity histogram in the fiducial model, the 
blue peak rises by a factor of 2. The red peak remains vir- 
tually unaffected, since the more recently formed red clusters 
are less subjected to dynamical disruption. We ran the Dip 
test on all grid distributions, including both surviving and dis- 
rupted clusters. Among the models with the number of sur- 
viving clusters in the range 100 <N < 200, few distributions 
have Pdip > 50% and none has Pdip > 80%. This means vir- 
tually no bimodality. Indeed, the distributions appear almost 
entirely unimodal, with the peak in the blue end and nothing 
more than a tail in the red end. This leads to a prediction that 
late-type galaxies, which have more continuous cluster for- 
mation than early-type galaxies, may be less likely to exhibit 
bimodal cluster populations. 

4.5. Alternative Formation Prescriptions 

As alluded to in Section [2] some equations that we used in 
the prescriptions for the stellar mass and the cold gas fraction 
were based on only a few observed points. Currently there 
is limited observational or theoretical understanding of how 
these functions should behave at high redshift, which is the 
period of primary interest for our study. Below we consider 
some alternatives for these prescriptions. 

The stellar fraction that we adopted from lWoo etafl (120081) 
is well motivated by Milky Way dwarf galaxies at the present 
ep och, but the abundance-ma tching models such as the one 
by iConroy & Wechslerl (|2009) predict a steeper dependence 
on halo mass in the range 1O 8 M < M» < 10 I0 M Q . Addi- 
tionally, the redshift dependence of this relation is uncertain, 
and recent observational s urvey s dBorch et alj20 06: Bell etail 
l2007t iDahlen et al.l l2007) have advocated a slower evolution 
than (1 +z)~ 2 adopted in our model. To accommodate this un- 
certainty, we re-ran the entire parameter grid with equation 



(0 instead of equation (0. The corresponding contour plot 
is shown in Figure |2U and the best-fit metallicity distribution 
is shown in Figure [22] This best fit is capable of reproducing 
the observed metallicity (Pks.z = 49%) and mass distributions 
(Pksm - 9.5%), similar to our fiducial model. The accept- 
able range of the parameter space is narrower and shifted to- 
wards higher values of p2, as the steeper mass slope otherwise 
prevents low-mass halos from forming a sufficient number of 
clusters. Nevertheless, this alternative prescription still leads 
to a significant chance of a bimodal metallicity distribution. 

Current observational constraints on the gas fraction at 
high redshift are even more uncertain. We adjusted the fit 
by altering the scale mass in equation (O. As alternatives 
to the fiducial model, we considered a power law of red- 
shift, M s =M s .o(l+z) 2 , and an inverse time dependence, 
M s = M s x>(t/to)~ l . Both of these relations resulted in lower 
gas fractions during the high-redshift epoch when most glob- 
ular clusters should form. The gas fractions were too low for 
case-2 formation for any reasonable choice of ps . More im- 
portantly, the low gas masses did not allow even the most mas- 
sive halos to form star clusters until intermediate redshifts. 
Therefore, none of these fits is a viable alternative to the fidu- 
cial model. 

The same problems manifested if we took the simplistic ap- 
proach of holding the gas fraction constant for all halo masses 
at all times. This idea was initially considered to see if we 
could generate simple results based only on halo merger histo- 
ries without speculation on the baryonic physics. We quickly 
realized that this approach was not going to work. Setting f g 
too low effectively prevents cluster formation at high redshift, 
when blue globular clusters are expected to form, as most ha- 
los cannot build up sufficient mass to overcome the minimum 
mass requi red to form a single massive star cluster (discussed 
in Section l2~2b . A constant gas fraction that is too high, on 
the other hand, presents obvious unphysical predictions at low 
redshift, and in particular would drastically over-predict the 
number of young clusters, forcing us to arbitrarily truncate 
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Figure 22. Metallicity distribution at z = in the best-fit model with an alter- 
native prescription for M* as a function of M;,, given by equation (solid 
histogram), compared to the observed distribution of Galactic globular clus- 
ters (dashed histogram). 



their formation. 

We also considered a n alternative p a ramet rization of the gas 
fraction, suggested by IStewart et al.l (120091) . They took the 
same observational constraints as us, but fitted them as 



^=0.04f \ 
M* V4.5xlO n M 







-0.59(l+z)° 



(27) 



This formula predicts so much cold gas at high redshift that 
many low-mass halos would be able to form clusters via 
case- 2 channel for any ps < 1. If we completely disable 
case- 2 formation and use the above prescription for the 
gas mass, we find many model realizations consistent with 
the observed metallicity distribution. This prescription differs 
from our fiducial choice in that it produces considerably more 
young clusters and achieves less clear metallicity bimodal- 
ity. The maximum value of the likelihood function attainable 
with this prescription is approximately half of the value for 
the fiducial prescription. Nevertheless, it could still produce 
acceptable globular cluster results. 

In addition to changing the formulation of the fits, we in- 
vestigated the effect of adding a random Gaussian dispersion 
with standard deviation crfl ts to the right hand sides of equa- 
tions (0), ©, and (O to reflect their intrinsic scatter as well 
as observational uncertainty. Different random values for the 
three scatters are generated for each halo at each timestep, but 
we always force the condition ([8]) on the total baryon con- 
tent. For simplicity, we used the same magnitude of af\ ts for 
the scatter added to all three equations simultaneously. We 
re-ran the parameter search grid using cr nts = 0.1, 0.2, and 0.3 
dex. For each value of <7 nts , we were still able to find mod- 
els with high values of Pks.z and overall likelihood statistic, 
although these values decline with the increasing amount of 
scatter. The metallicity probability varies from 49% to 16% 
to 6%, for (jfits = 0.1,0.2,0.3 dex respectively. 

As an alternative to scatter in the cutoff mass (eq. [3]) with 



a fixed functional form for the gas fraction (eq. |2), we tried 
adding scatter to equation (f2j) while keeping fixed equation 
(0). Adding scatter to f m allows the gas fraction to ex- 
ceed the threshold p$ much more easily and produce too 
many case-2 clusters. To avoid unphysical results, we an- 
alyze only results for case-1 formation. In this case we 
find the best-fit models with P KS .z = 47%, 13%, and 5%, for 
(Tfits = 0.1,0.2,0.3 dex respectively. These models are still 
consistent with the observed Galactic distribution. 

The addition of scatter as described above has two system- 
atic effects on any individual realization of the metallicity dis- 
tribution: the high metallicity tail is extended even further and 
the height of the blue peak is damped relative to the case of 
no scatter. The former effect is due to the possibility of draw- 
ing higher values of M* and hence higher [Fe/H]. The lat- 
ter effect arises from the enforcement of equation ||8), which 
prevents gas-rich halos at high-redshift from gaining any ex- 
tra gas from the positive scatter in equation (|5); on the other 
hand, negative scatter can prevent some of these halos from 
being eligible for case-2 formation. Accordingly, the best 
distributions with higher values of crgts were found for models 
with low values of p$. We note that the Dip probability in 
most realizations is not strongly affected by the new scatter, 
implying that the actual smearing of the peaks is not signifi- 
cant and bimodality is preserved. 

4.6. Alternative Dynamical Disruption 

In Section [3] we noted that the expression for the evapora- 
tion rate (eq. 11231 ) contains some inherent parameters. Here 
we explore alternative disruption models with different values 
of £ e and 5 within the fiducial formation prescription. 

The effect of decreasing £ c is simply to reduce the number 
of clusters that are completely disrupted by z = 0. In the fidu- 
cial model with £ e = 0.033, about 60% of the original sample 
is disrupted. (Note that this implies that roughly 5 x 10 7 M Q 
worth of stars in the Galactic stellar halo could be remnants 
of the disrupted clusters.) With the factor of £ c = 0.02, only 
~ 30% are disrupted. With the factor of £ e = 0.01, almost all 
clusters survive. 

The effect of decreasing i5 and 5q is to shift the peak of the 
mass function to a lower mass. 

We repeated the grid parameter search for the best metal- 
licity distribution for two alternative prescriptions, one with 
& = 0.02, 5 = Sq = 1/3, and the other with £ e = 0.033, S = 5 = 
1 /9. We found that in both cases our model could produce an 
observationally-consistent metallicity distribution. However, 
lowering either £ e or S significantly alters the mass function 
away from the data, by allowing too many low-mass clusters 
to survive. Raising £ e and 5 may improve the mass function, 
but steers away from recent constraints on the two parame - 
ters dBaumgardt & Makinol2003l:lGieles & Baumgardtl2008l) . 
Therefore, we ultimately conclude that our fiducial prescrip- 
tion (£„ = 0.033, S = 8 = 1 /3) works best. 

For illustration, we list below the properties of the best 
models in the two alternative prescriptions. For £ e = 0.02, 
S = Sq = 1/3, we find a peak model that has a metallic- 
ity distribution with Pks,z = 69% and mass distribution with 
Pksm = 0.02%. The parameters of this model are p2 = 4.4, 
p 3 = 1.4, p 4 = 0,p 5 = 0.99. 

For £ e = 0.033, S = So = 1 /9, we find a peak model that has 
a metallicity distribution with Pks.z = 19% and mass distri- 
bution with Pks.m = 12%. The parameters of this model are 
P2 = 4, p3 = 1.4, p4 = 0, ps = 0.97. Its metallicity distribu- 
tion is shown in Figure [23] The overabundance of metal-poor 
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Figure 23. Metallicity distribution at z = in the best-fit model with an al- 
ternative dynamical disruption prescription, S = Sq = 1/9 (solid histogram), 
compared to the observed distribution of Galactic globular clusters (dashed 
histogram). 



clusters is clear. 

The latter alternative prescription (5= 1/9) predicts the dis- 
ruption tim e to scale with cluste r mass in a manner nearly 
identical to Gieles & Baumgardt (2008), if we take that all 
model clusters have a mean galactocentric frequency u = 
(2.4 x 10 7 yr) _1 . Based on the orbit calculations discussed 
in the next section, we find a similar median value of uj for 
the sample of model clusters in the fiducial model. We also 
find no correlation between ui and cluster mass, which con- 
firms our assertion in Section [3] that in such a model the dis- 
ruption time of the average cluster would scale with mass as 



:M 2 / 3 . 



5. SPATIAL DISTRIBUTION 



Using the same jV- body simulation as in this paper, 
iPrieto & Un edin (2008) investigated the present spatial distri- 
bution of model clusters that formed in high-redshift (metal- 
poor) galactic systems. They calculated the orbits of clusters 
from the time when their host galaxies accreted onto the main 
galaxy and identified three distinct populations. Disk clus- 
ters formed in the most massive halo that eventually hosts the 
present Galactic disk. These clusters, found within the inner 
10 kpc, are scattered into eccentric orbits by the perturbations 
from accreted galactic satellites. Inner halo clusters, found 
between 10 and 60 kpc, came from the now-disrupted satellite 
galaxies. Their orbits are inclined with respect to the Galactic 
disk and are fairly isotropic. Outer halo clusters, beyond 60 
kpc from the center, are either still associated with the surviv- 
ing satellite galaxies, or were scattered away from their hosts 
during close encounters with other satellites and consequently 
appear isolated. 

The azimuthally-averaged space density of metal-poor 
globular clusters is consistent with a power law, n(r) oc r" 7 , 
with the slope 7 w 2.7. Since all of the distant clusters origi- 
nate in progenitor galaxies and share similar orbits with their 



hosts, the distribution of the clusters is almost identical to that 
of the surviving satellite halos. This power law is similar to 
the observed slope of the metal-poor globular clusters in the 
Galaxy. However, the model clusters have a more extended 
spatial distribution (larger median distance) than observed. In 
the model it is largely determined by the orbits of the progen- 
itor galaxies and the epoch of formation. iMoore et al.[ (2006) 
showed that the early-forming halos are more spatially con- 
centrated and in order to match the Galactic distribution, glob- 
ular clusters would need to form at z ~ 12. However, such an 
early formation may be inconsistent with the requirement of 
high mass and density of the parent molecular clouds. 

In this work, we have retraced some of these steps to at- 
tempt to reproduce the spatial distribution of the whole Galac- 
tic globular cluster system. 

The clusters that formed in the disk of the main halo are 
assigned radial positions according to the exponential pro- 
file, dN/dR oc Re~ R l Rj , with the observed scale length of 
the Galactic disk, R4 = 3 kpc. The azimuthal angles are as- 
signed randomly. The vertical position in the disk is also 
assigned randomly, with the scale-height of one fifth of the 
scale-length. The clusters are limited to the radial range 
0.6 < R < 10 kpc, where the observed disk globular clusters 
are located. The distances are also given a random Gaussian 
scatter of 10% to replicate observational distance uncertain- 
ties. 

Clusters that formed in satellite halos that survived until z = 
are assigned the present position of the host, with a small 
displacement analogous to the distribution in the main disk. 
Clusters that formed in subhalos that did not survive until z = 
are initially assigned the last known position and velocity 
of the host in the simulation, with the same displacement as 
above. We then follow the orbits of these stray clusters until 
z = using a leap-frog integrati on scheme with fixed tim e step. 

The orbit integration follows Prieto & Gnedin (2008). The 
main halo and the satellite halos contribute their NFW po- 
tentials, while the disks within the halos contribute the 
Miyamoto-Nagai potentials with the total mass of gas and 
stars computed from equations (O and (0. The total grav- 
itational potential is computed by linearly interpolating the 
masses of halos and subhalos between the simulation snap- 
shots. Positions of subhalos at each timestep are computed 
with cubic splines between the snapshots. We also include 
the acceleration on the clusters that result s from the use of the 
splines, as described in lPrieto & Gnedirj (120081) . Cosmolog- 
ical dark energy contributes an additional component to the 
acceleration in physical coordinates: aA = fl\H^r. 

Just as in the previous study, we find a more extended spa- 
tial distribution of the globular cluster system than that ob- 
served in the Galaxy. Clusters that formed in surviving satel- 
lites (about 24% of the sample in the fiducial model) are the 
most distant from the center, as forced by the location of the 
satellites. The orbit integration for the clusters formed in dis- 
rupted satellites (about 52% of the sample) shows that these 
clusters also do not migrate in r far from the last known po- 
sition of their host. Such coupling to the dark matter halos is 
the main reason for the overextended cluster system. 

Clusters that formed in the disk of the main halo (the re- 
maining 24% of the sample) most closely resemble the spa- 
tial properties of the Galactic clusters. They are confined to 
the inner 10 kpc and would be referred to as the bulge or disk 
clusters. However, this group should contain more than 50% 
of the sample to be consistent with observations. Recent paper 
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bv lGriffen et al.l (120101) similarly investigated the formation of 
red clusters by major mergers in the Aquarius simulation and 
concluded that such clusters must have formed in the central 
disk. 

Note that our o rbit calculations, as well as those by 
Griff en et al.l (IMTot) . use the gravitational potential derived 
in collisionless cosmological simulations. Stars and cold gas 
would deepen the gravitational potential in the inner regions 
of the main halo and bring the satellites closer to the cen- 
ter. Dense stellar nuclei of the satellites should also sur- 
vive against tidal disruption longer t han pure dark mat ter ha- 
los. Hydrodynamic simulations of iNaab et al.l ((2009) show 
that the combined effect of baryons may be to deposit half 
of stellar remnants of the disrupted satellites, including their 
globular clusters, within 10 kpc of the center. This would ef- 
fectively reconcile the predicted cluster distribution with the 
Galactic sample, since over 50% of our clusters formed in dis- 
rupted satellites. More detailed hydrodynamic simulations of 
galaxy formation are needed to verify either hypothesis. 

An observational test of the cluster orbits would be possible 
when proper motions are measured for a large fraction of the 
Galactic clusters. Such measurements could be achieved with 
the planned SIM-Lite space observatory. 

6. GLOBULAR CLUSTER COLORS 

We attempted another direct comparison of the model pre- 
dictions with the observed sa mple, by constructing singl e stel- 
lar population models using iBruzual & Chariot] (120031) code 
GALAXEV. As input for GALAXEV, we used the age and 
metallicity for each globular cluster created with the fiducial 
model. 

The distribution of the model B - V color was considerably 
less bimodal than the metallicity distribution discussed in pre- 
vious sections. The main cause of the smearing of the two 
peaks appears to be the younger age of metal-rich clusters 
predicted by the model. The metal-poor clusters constitute a 
clearly defined peak at B-V = 0.67, which corresponds well 
to the blue peak of the Galactic sample. But the metal-rich 
clusters, which are expected to make up the red peak, have 
a mean B-V color of 0.77, while the observed red mean is 
close to 0.85. The standard deviation of the red model clus- 
ters is 0.08, implying that the result is consistent within one 
sigma of the observed, but the bimodality of the distribution 
is not evident to the eye. 

To test the hypothesis that the smearing of the color peaks 
compared was due to the relative age of the populations, we 
ran the population synthesis models again, this time using a 
constant age of 12.1 Gyr for all clusters. The resulting dis- 
tribution indeed appeared to constitute two peaks, with a blue 
peak at a mean of B-V = 0.67 and a red peak at B-V = 0.84, 
with a clearly defined gap between them. It should be noted 
that a known discrepancy exists between the B-V colors pre- 
dicted by all major popula tion synthesis codes an d those of 
observed globular clusters (IConroy & Gunnll2010l) . All mod- 
els predict colors that are too blue at high metallicity, which 
would directly play into smearing bimodality in our result. 

In addition to the colors, we examined the cluster luminosi- 
ties (absolute V-band magnitudes) calculated by GALAXEV. 
These allow a more direct comparison with the observations 
than the mass function presented in Section [3] for which we 
were required to assume a constant mass-to-light ratio for 
all observed clusters. This constant MjLy ratio has been a 
traditional appro ach, but has come under recent scrutiny by 
Kruiissen (2008) who argued that M/Ly may vary as a func- 



tion of cluster mass. However, the distribution of V-band 
magnitudes for our fiducial model has a KS probability of 
4.3%, which is not a significant departure from the 7.4% for 
the mass function. When we tried the same exercise as above 
by setting the ages of all model clusters to 12.1 Gyr, the KS 
probability jumped to 25%. This improvement likely hap- 
pened because we converted the magnitudes of observed clus- 
ters into masses using M/Ly = 3, while GALAXEV typically 
predicted M/Ly < 3 for the 12.1 Gyrisochrones. This brought 
the mean luminosity of model clusters closer to the observed 
value than the average mass of model clusters was to its ob- 
served counterpart. 

Even though these population synthesis results are interest- 
ing, we believe that the mass and metallicity distributions pre- 
sented in previous sections are more reliable. Population syn- 
thesis modeling adds an extra layer of empirical uncertainty 
to our results, as the specific nature of horizontal-branch evo- 
lution remains an issue that has not been completely resolved. 

7. SUMMARY AND IMPLICATIONS FOR GALAXY 
FORMATION MODELS 

We have presented a model for the origin of the metallicity 
distribution of globular clusters. In our scenario, bimodality 
results from the combination of the history of galaxy assem- 
bly (rate of mergers) and the amount of cold gas in protogalac- 
tic systems. Early mergers are frequent but involve relatively 
low-mass protogalaxies, which produce preferentially blue 
clusters. Late mergers are infrequent but typically involve 
more massive galaxies. As the number of clusters formed in 
each merger increases with the progenitor mass, just a few 
late massive mergers can produce a significant number of red 
clusters. The concurrent growth of the average metallicity of 
galaxies between the late mergers leads to an apparent "gap" 
between the red and blue clusters. 

The peak metallicities of the red and blue populations are 
remarkably robust to variations of the model parameters. The 
peaks encode the mass-metallicity relation in galaxies and do 
not depend strongly on the rate or timing of cluster formation. 
The exact definition of a major merger is also not important 
for our result, as long as the merger mass ratio is at least 1:5. 

Our conclusions on the origin of metallicity bimodality are 
not significantly affected by the large uncertainties in our 
knowledge of the stellar mass and cold gas mass in high- 
redshift galaxies. We considered alternative prescriptions for 
the stellar fraction, gas-to-stars ratio, and even dynamical dis- 
ruption, but in all cases found a metallicity distribution con- 
sistent with the observations. Such robustness indicates that 
most external factors are not as important as the internal mass- 
metallicity relation in host galaxies. 

We find that dynamical disruption over the cosmic history 
naturally converts an initial power-law cluster mass function 
into an observed log-normal distribution. A continuous for- 
mation of clusters in the first several Gyr help to replenish the 
depleted low-mass end. Dynamical disruption also helps es- 
tablish metallicity bimodality by preferentially depleting old 
clusters in the metal-poor peak. 

Our prescription links cluster metallicity to the average 
galaxy metallicity in a one-to-one relation, albeit with random 
scatter. Since the average galaxy metallicity grows monoton- 
ically with time, the cluster metallicity also grows with time. 
Our model thus encodes an age-metallicity relation, in the 
sense that metal-rich clusters are somewhat younger than their 
metal-poor counterparts. Observations of the Galactic globu- 
lar clusters indicate an age spread that ranges from 1 Gyr for 
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the inner blue clusters to 2 Gyr for the inner red clusters to 6 
Gyr for the outer clusters, which is generally consistent with 
the predicted spread. However, the model may be marginally 
inconsistent with the observation that some of the metal-rich 
clusters appear as old as the metal-poor ones. Note that our 
model is still simplistic and does not include metallicity gra- 
dients within protogalaxies, which may dilute the predicted 
age-metallicity relation. 

Our model demonstrates that star cluster formation during 
gas-rich mergers of protogalactic systems is a single mech- 
anism that successfully reproduces many observed proper- 
ties of the Galactic globular clusters. It may avoid the need 
for two separate formation mec hanisms for the red a nd blue 
clusters invoked in the model of iBeaslev et al.l (|2002). Their 
model relied on constant cluster formation efficiency relative 
to the field stars, but required different efficiencies for the 
two modes. While the red clusters in their models contin- 
ued forming through galaxy mergers, the blue clusters needed 
to be arbitrarily shut off at z = 5. This was the main cause of 
the bimodal metallicity distribution in their model, as the blue 
clusters did not have much overlap with the red clusters that 
formed in major mergers involving metal-en riched gas con- 
siderably after z = 5. The Bea sley et al.l (J2002) model also ne- 
glected the effects of the dynamical evolution that shaped the 
present cluster distribution. In contrast, in our model some 
old blue clusters are disrupted and some are unable to form at 
recent times because the protogalaxies are gas-poor. Another 
difference is that in our model major mergers contribute both 
red and blue clusters, while in Beasley et al. they contribute 
only red clusters. We also find that globular clusters form sig- 
nificantly earlier than the bulk of field stars and therefore the 
two cannot be linked by a constant formation efficiency at all 
times (see Fig. [Tot. 

We have compared the metallicity distribution of globu- 
lar clusters to the mass-weighted metallicity distributions of 
other stellar populations as predicted by our scaling relations 
given in Section 2. We find that galaxy field stars overall 
have a single-peaked distribution with a mean of [Fe /H] ps 0, 
a metal poor tail, and no stars with [Fe/H] > 0.4. This is 
consistent with our current understanding of the metallicity 
of stars in the Galactic disk. The stars in surviving satellites, 
which correspond to Milky Way dwarf galaxies, also appear 
to have a single-peaked distribution with a mean metallicity 
[Fe/H] fa — 1. Only the globular cluster system display a bi- 
modal metallicity distribution. 

We derived some simple scaling relations for the overall 
efficiency of globular cluster formation. We adopted the clus- 
ter formation rate in gas-rich, high-redshift merger events 
(eq. [Tol l that scales with the host system mass as Mqq ~ 
lO~ 4 M g /f b ~ \Q~ 4 M h . We have later learned of a similar 
empirical relatio n for all types of massiv e g alaxies, derived 
indepe ndently by Spitler & Forbes (2009) and Georgie v et al.l 
(2010). The outcome of the model is a prediction that the frac- 
tion of galaxy stellar mass locked in star clusters, Mcc/M^, 
is of the order 10-20% at z > 3 and then declines steadily 
with time to about 0.1% at present. The specific frequency 
parameter follows a similar decline with time and reaches 
iV/(M*/10 M©) ~ 1 at the present. These efficien c ies ar e 
in agreement with the compilations of McLaug hlin! (119991) . 



iRhode et al.l d2005l) . and lPeng etafl ( 120081) . We also find that 
the globular cluster system overall is significantly more metal- 
poor than the galactic spheroid, which is populated by stars 
from the disrupted satellites. 

Our scenario can be applied to other galactic environments, 
such as those of elliptical galaxies which c ontain much larger 
samples of globular clusters. For example, Peng et al.l (120081) 
showed that the fraction of red clusters increases from 10% 
to 50% with increasing luminosity of elliptical galaxies in the 
Virgo cluster. In our model, globular cluster formation is en- 
tirely merger-driven. We showed that the Galactic sample 
may have arisen from early super-gas-rich low-mass merg- 
ers and later metal-rich high-mass mergers. Compared to the 
Galaxy, giant ellipticals are expected to experience more high- 
mass mergers which would contribute more prominently to 
the globular cluster system. As Figure [7] shows, such mergers 
would produce comparable numbers of red and blue clusters 
simultaneously. Thus the fraction of red clusters should in- 
crease with galaxy mass, reaching ~ 50% f or giant ellipticals. 
This trend, observed bv lPeng et alj (120081) . may be a natural 
outcome of the hierarchical formation. 

At the other end of the galactic spectrum, dwarf galaxies 
likely lacked metal-rich mergers and produced only metal- 
poor blue clusters. In particular, dE and dSph type dwarfs 
which are now deprived of cold gas are not expected to con- 
tain any young and metal-rich clusters. Some dlrr galaxies, 
such as the LMC, still possess considerable amounts of cold 
gas and may produce younger clusters, although they are still 
likely to have subsolar metallicity. The variety of globular 
cluster ages observed in the LMC indicates that it may have 
had bursts of star formation throughout its cosmic evolution. 

Our study places interesting constraints on galaxy forma- 
tion models. Within the framework of our model, acceptable 
mass and metallicity distributions result only from a certain 
range of the parameters. In particular, the minimum ratio of 
masses of merging protogalaxies strongly correlates with the 
cluster formation rate. If the clusters form very efficiently 
only a few massive mergers are needed; if the clusters form 
inefficiently many mergers are needed, which requires a lower 
merger threshold. However, mass ratios of less than 0.2 are 
disfavored in the model (see Fig. [13). Formation of mas- 
sive clusters in very gas-rich systems without detected merg- 
ers (our case -2 scenario) improves the final mass function 
but is not required for reproducing the metallicity distribution. 
Thus, globular cluster formation solely in major mergers is 
consistent with the available observations. Finally, our results 
rest on the derived prescription for the cold gas fraction as 
a function of halo mass and cosmic time. This prescription 
(Fig-ID can be tested by future observations of high-redshift 
galaxies with JWST and by detailed hydrodynamic simula- 
tions. 
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copy of the original KMM code, Karl Gebhardt for a discus- 
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ability table of the Dip test, Evan Kirby for a discussion 
of metallicities of the ultrafaint dwarfs, and Eric Bell, Nick 
Gnedin, Andrey Kravtsov, and Lee Spitler for comments on 
the manuscript. 
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APPENDIX 



QUANTIFYING BIMODALITY 

Quantifying whether a distribution is better described by one or two modes is still an unsolved problem in statistics. While 
there are algorithms that split the input distribution into two modes or assign probabilities that a given data point belongs to either 
of the two modes, there is no proper statistic that evaluates whether such a split is preferred to a unimodal distribution. In this 
Appendix, we describe a popular KMM algorithm and our improvement of it, as well as an independent test of bimodality based 
on the Dip statistic. 



GMM - A better version of KMM 

lAshman et al.l ( 119941) popularized a mixture modeling code KMM for detecting bimodality in astronomical applications. This 
code has been widely used for globular cluster studies and can be considered a standard method in the field. The KMM algorithm 
assumes that an input sample is described by a sum of two Gaussian modes and calculates the likelihood of a given data point 
belonging to either of the two modes. It also calculates the likelihood ratio test (LRT) as an estimate of the improvement in 
going from one Gaussian to two Gaussian distributions. However, the LRT obeys a standard \ 2 statistic only when the two 
modes have the same width (variance), which may not be satisfied by real datasets. Even though the probability of the LRT can 
be estimated using bootstrap in principle, in practice the use of the KMM c ode h as been limited to common width modes (the 
so-called "homoscedastic" case). Brodie & S trader (2006) and Wat ers et~ail (|2009) provide further discussion of KMM. 

The KMM method belongs to a general class of algorithms of Gaussian mixture modeling (GMM). GMM method s maximize 
the lik elihood of the data set given all the fitted parameters, using the expectation-maximization (EM) algorithm (e.g., Pre ss et alj 
l200l . A major simplification, which allows one to derive explicit equations for the maximum likelihood (ML) estimate of the 
parameters, is that each mode is described by a Gaussian distribution. 

For simplicity, and as appropriate for the metallicity distribution, we consider a univariate input data set. However, the algorithm 
is fully scalable to multivariate distributions. The likelihood function of a univariate sample x„ is 



C K = Yl\^2PkN(x„\fi kl a k )j, (Al) 

where 



, k=\ 



N{x\ l x,a)= (27Rr2)1/2 e*P 
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is the Gaussian density. The modal fractions are normalized as J^kPk = 1- A unimodal distribution (K = 1) has two independent 
parameters {p, and er), whereas a bimodal distribution has five parameters (pi,/tti,<7i,/X2,(72)> since pi = l—p\- 

The power of the GMM method lies in its ability to determine the ML values of the parameters (pt, jujt,<7fc). The disadvantage is 
that the method will always split the data set into the specified number of modes, K. In order to detect bimodality it is extremely 
important to be able to judge whether the bimodal fit is an improve ment over the unimodal fit. For this purpose the KMM code 
uses the LRT test, which appears to be an approximation derived by Wolfi (119711) for the homoscedastic cas e (a\ = cri). D efine 
the ratio of the maximum likelihoods as A = £i,max/>C2,max- According to numerical Monte Carlo studies of I Wolf d (1197 lh . the 
statistic -2 In A approximately obeys the \ 2 distribution with a number of degrees of freedom equal to "twice the difference 
between the number of parameters of the two models under comparison, not including the mixing proportions" ( McL achlanl 
119871) . This is the xi distribution in our case. However, the statistic does not apply in the heteroscedastic case <j\ =f 02 (it would 
have been xf)- Note that this unusual number of degrees of freedom was found as an empirical approximation. Unfortunately, 
no exact estimation exists for the goodness of modal split. 

Several variations of the method have been suggested in the literature. Mc Lachlanl (Q987) proposed a parametric bootstrap 
to test for the number of components. In this method, a test sample is drawn randomly from a unimodal Gaussian distribution 
with the parameters {/i, a] best-fitting the input sample. The number of objects in the test sample is taken the same as in the 
input sample. The bimodal split is calculated for this test sample using the EM algorithm and the likelihood ratio Ab 00 t is saved. 
Repeating the bootstrap a large number of times, we obtain the probability of randomly drawing the ratio as large as that observed 
in the input sample, A b s . If the probability is below a few percent, we reject the null hypothesis that the input sample belongs to 
a unimodal Gaussian. 

However, the parametric bootstrap is not a perfect solution. In the limit of a large number of objects in the input sample, the 
likelihood function is very sensitive to outliers far from the center of the distribution. Simple measurements errors in the wings 
of the Gaussian function may cause a unimo dal distribution to be rejec ted, even if it is correct. In other words, GMM is more a 
te st of Gau s siani ty than of unimodality (see Muthen 2003; Bauer 2007, for more discussion). 

lLo et al.l (120011) proposed a modified LRT method to test for the true number of components of a Gaussian mixture. The 
mo dified statistic must be evaluated numerically, but still does not address the problem with Gaussian wings. Subsequently, 
ILoI d2008) suggested to use the standard LRT with the parametric bootstrap to test for heteroscedastic split, and also suggested 
restricting the ratio of the standard deviations of the two mod es to be not less tha n 0.25, to avoid numerical artifacts. Such a 
method was recently implemented in globular cluster studies bv lWaters et al.l (120091) . 

The sensitivity of LRT to the assumption of Gaussian distribution calls for additional, independent tests of bimodality. A useful 
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and intuitive statistic is the separation of the means relative to their widths: 



P= lMl ~ M2 ' ■ (A3) 



We use the factor \/2 for consistency with the definition in Ashman et al] (119941) . who noted that D > 2 is required for a clean 
separation between the modes. If the GMM method detects two modes but they are not separated enough (D < 2), then such a 
split is not meaningful. The power of GMM in this case is counterproductive. A histogram of such a distribution would show no 
more than two little bumps, which would not be recognized as distinct populations. 

Another simple statistic is the kurtosis of the input distribution. A positive kurtosis corresponds to a sharply peaked distribution, 
such as the Eiffel Tower. A negative kurtosis corresponds to a flattened distribution, such as a top hat. A sum of two populations, 
not necessarily Gaussians, is broader than one population and therefore has a significantly negative kurtosis. However, kurt < is 
a necessary but not sufficient condition of bimodality. A broad unimodal distribution, such as an actual top hat, also has negative 
kurtosis. Therefore, kurt < is only useful as an additional check to support the results of LRT and the D-value. 

In order to provide a more robust measure of the modal split, we have revised and implemented the GMM algorithm inde- 
pendently of the KMM code. We begin with a single run of the EM algorithm to calculate the means and standard deviations 
assuming a heteroscedastic bimodal distribution. Then we repeat the estimation assuming a unimodal Gaussian case. We take 
the ratio of the likelihoods A, the separation D, and the kurtosis as the three statistics of interest. We then estimate the error 
distribution for the modal parameters using non-parametric bootstrap (drawing from the input sample with repetitions) of 100 
realizations. We also run the parametric bootstrap to assess the confidence level at which a unimodal distribution can be rejected 
based on each of the three statistics. Its practical application for an input sample of more than 100 objects is limited to about 
1000 bootstrap realizations, limiting the confidence level to ~ 10~ 3 . A sufficiently low probability of each statistic means a uni- 
modal distribution can be rejected in favor of a bimodal distribution. The code also calculates the probability of each data point 
belonging to either mode. 

A sum of two Gaussians with the same variance can sometimes be preferred to the case of different variances, because of the 
one fewer degree of freedom. The choice between the homoscedastic and heteroscedastic cases can be similarly made using LRT, 
but we feel that it is less important than choosing between a bimodal and unimodal distributions. For comparison with the KMM 
code, we calculate the homoscedastic split and its approximate probability using the \\ statistic. We also calculate an alternative 
split into two Gaussian modes with the same mean but different variance. This is a more extended distribution than a single 
Gaussian, which may be a better fit for a unimodal but non-Gaussian sample. 

The steps of our algorithm are summarized below: 

1. Calculate {/i,er} and {/»i,/ii,cri, /Lt 2 ,cr 2 } using a single EM run. 

2. Form three statistics: A, D, and kurt. 

3. Run non-parametric bootstrap to estimate the errors: A/i^, Ao>, Ap\, and AD. 

4. Run parametric bootstrap to estimate the probability of a unimodal distribution, according to A, D, and kurt. 

As a first test of the algorithm, we verified that it reproduces exactly the test output of the KMM code, given the test input 
provided with the code. 

We also made two random realizations of a unimodal Gaussian distribution, A^(0, 1), with 150 objects and 1500 objects, respec- 
tively. The smaller sample has the mean and standard deviation of /i = -0.088 and a = 0.987, within the intended target given the 
sample size. Indeed, a non-parametric bootstrap gives A/i = 0.084, Act = 0.063. The kurtosis of the input sample is kurt = 0. 104. 
A heteroscedastic split gives two peaks, by construction, with /ii = -0.483, o\ = 1 .206 and /i 2 = -0.032, er 2 = 0.938. However, the 
split is not statistically significant. The likelihood is improved only by -2 In A = 0.26 relative to the unimodal case, which gives 
the probability better than 99% that the input sample is unimodal. The parametric bootstrap gives a similar probability of 96%. 
The separation of the peaks also leads to the same conclusion: D = 0.42 ± 1 .46. The parametric bootstrap probability of drawing 
such value of D randomly from a unimodal distribution is 87%. The probability of drawing the measured kurtosis is 74%. Thus, 
all three statistics show correctly that the input distribution is not bimodal. The larger test sample has smaller parameter errors, 
as expected, but similar significance levels from the parametric bootstrap. 

We then apply the GMM algorithm to the sample of observed metallicities of the Galactic globular clusters. A unimodal fit gives 
/i = -1 .298 ± 0.049 and a = 0.562 ± 0.028, where the errors are calculated with the non-parametric bootstrap. A heteroscedastic 
split gives /i, =-1.608 ±0.064, a\ =0.317 ±0.051 and ^ 2 = -0.583 ± 0.074, a 2 = 0.281 ± 0.075. Of the total number of 148 
clusters, 103 (or 70%) are in the metal -poor group and 45 (or 30%) are in the metal-rich group. A homoscedastic split gives 
m = -1 .620 ± 0.037, H2 = -0.608 ± 0.055, and a x = er 2 = 0.303 ± 0.026. In this case, there are 101 metal-poor clusters and 47 
metal-rich clusters. In either case, the likelihood improvement in the 1000 parametric bootstrap realizations is never as high as 
observed, -2 In A = 27.5. That is, a unimodal distribution is rejected at a confidence level better than 0.1%. The separation of the 
peaks is also very clear, D = 3.42 ± 0.47. The observed cluster distribution is indeed bimodal! 

When we apply the GMM algorithm to our model sample, we also find strong bimodality. In order to keep the sample size 
similar to the data, we test separately each of the 1 1 random model realizations of each host halo. The resulting parameters form 
a set from which we calculate the average parameters. To estimate both the statistical and systematic errors of each parameter, 
we sum in quadrature its mean standard deviation in the set and the scatter among the realizations. A heteroscedastic split 
gives |U, = -1 .543 ± 0.045, a x = 0.3 18 ± 0.043 and ^ 2 = -0.576 ± 0.077, cr 2 = 0.237 ± 0.026. The break-down is 66% (±4%) of 
the clusters in the metal-poor group and 34% in the metal-rich group. The average separation of the peaks is very significant, 
D = 3.44 ± 0.29. None of the 1000 parametric bootstrap sets of a unimodal distribution returned any of the three statistics (A, D, 
kurt) as high as those found the model. Again, a unimodal distribution is rejected at a confidence level better than 0.1%. Thus, 
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our fiducial metallicity distribution is bimodal as well. 

A d ifferent methodology to the LRT in the Bayesian approach is the Odds Ratio, or the Bayes factor described in iLiddld 
(2009). The Odds Ratio is the ratio of the integrals of the likelihood function C over all possible ranges of the model parameters 
with the corresponding normalized distribution functions of the parameters. In the case of one-Gaussian vs. two-Gaussians, it is 
J £2P(p\)P{[i\)P(cr\)P{[J.2)P(o'2)dp\dfiid<T 1 diji2da2/ J C\P{p)P(a)dpdcr. The Odds Ratio is dimensionless and is greater than 1 
if the modal split indeed significantly improves the likelihood £2 over C\ . On the other hand, if the improvement in the likelihood 
is small, it can be washed out by integration over the additional parameters and the Odds Ratio becomes less than 1 . We have 
experimented with using the Odds Ratio as an objective criterion for the goodness of modal split but found that its application 
depends sensitively on the adopted range of the parameters and their (unknown) distribution functions. Therefore, we have not 
included it in our GMM code. 

Dip test 

A completely i ndependent test of unimodality w as proposed by Harti gan & Hartigan! (11985b . It was first used for globular 
cluster studies by Gebhardt & Kissler-Patig (1999). The Dip test is based on the cumulative distribution of the input sample. 
The Dip statistic is the maximum distance between the cumulative input distribution and the best-fitting unimodal distribution. 
In some sense, this test is similar to KS test but the Dip test searches specifically for a flat step in the cumulative distribution 
function, which corresponds to a "dip" in the histogram representation. The probability of rejecting a unimodal distribution is 
calculated empirically and tabulated as a function of sample size. We obtained an updated table of the probabilities calculated 
recently by Martin Maechler (www.cran.r-project.org/web/pac kages/diptest). 

We have added a driver routine to the original Fortran code of Hartig an & Harti gan ( 1985). Our code interpolates the probability 
table for any input sample size up to 5000 objects. Looking just at the significance levels, the Dip test appears less powerful than 
GMM. The Dip probability of the observed Galactic sample being bimodal is 90%, whereas the LRT probability is 99.998% and 
the parametric bootstrap probability is 99.9%. However, the Dip test has the benefit of being insensitive to the assumption of 
Gaussianity and is therefore a true test of modality. It is also muc h faster to run than the GMM code. We use the Dip probability 
for assessing bimodality of our model realizations in Section |4~4l 

Note that we searched for a statistical method that evaluates bimodality using full input informat ion, without having to bin the 
data. Another method, RMIX, based on a histogram of the input sample was recently suggested bv lWehner et al.l (120081) . 

Source codes of the GMM and Dip tests are available from one of the authors (OG) upon request. 
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